ESSAYS  ON  LONG  RUN  AND  SHORT  RUN  DYNAMICS  IN  ECONOMICS 


By 


YIKANG  LI 


A DISSERTATION  PRESENTED  TO  THE  GRADUATE  SCHOOL 
OF  THE  UNIVERSITY  OF  FLORIDA  IN  PARTIAL  FULFILLMENT 
OF  THE  REQUIREMENTS  FOR  THE  DEGREE  OF 
DOCTOR  OF  PHILOSOPHY 


UNIVERSITY  OF  FLORIDA 
1995 


I dedicate this dissertation to the faculty of the Department of Economics at the University
of Florida.


ACKNOWLEDGMENTS 


I would like to express my hearty thanks to my supervisor, Professor Mark Rush. His
admirable  intuition  and  friendly  suggestions  with  encouragement  have  improved  not  only  this 
thesis  but  also  my  understanding  of  economics. 

A special  thank  you  goes  to  Professor  G.  S.  Maddala  for  his  encouragement,  guidance  and 
keen  intuition. 

Special  appreciation  goes  also  to  Professors  William  Bomberger,  David  Denslow,  Larry 
Kenny,  and  Mark  Flannery  for  their  encouragement  and  unfailing  support. 

The  education  and  guidance  I received  from  Professor  Jusheng  Zheng,  my  ex-supervisor 
in  China,  was  my  inspiration  during  my  study  in  Florida.  I am  sincerely  grateful  to  him. 

I appreciate  many  helpful  discussions  with  Dr.  Chunrong  Ai  and  Mr.  Li  Zhu. 

Last but not least, I would like to thank my mother, my wife, and all other family members
for their patience and encouragement.


TABLE  OF  CONTENTS 


ACKNOWLEDGMENTS

ABSTRACT

CHAPTER 1 INTRODUCTION

CHAPTER 2 BOOTSTRAPPING COINTEGRATING REGRESSION
2.1 Introduction
2.2 The Basic Model
2.3 The Monte Carlo Experiment
2.4 Simulation Results
2.5 Conclusion

CHAPTER 3 LOW-PASS FILTERED LEAST SQUARES ESTIMATORS OF COINTEGRATING VECTORS
3.1 Introduction
3.2 The Model and the Estimators
3.3 Asymptotic Theory
3.4 A Monte Carlo Study
3.5 Conclusion

CHAPTER 4 FILTERING METHODOLOGY AND FIT IN DYNAMIC BUSINESS CYCLE MODELS
4.1 Introduction
4.2 Discussion of Current Filtering Methodology in Business Cycle Model
4.3 Monte Carlo Experiments
4.4 Fit of an RBC Model
4.5 Summary and Conclusions

CHAPTER 5 CONCLUSIONS

APPENDIX

REFERENCE LIST

BIOGRAPHICAL SKETCH


Abstract  of  Dissertation  Presented  to  the  Graduate  School 
of  the  University  of  Florida  in  Partial  Fulfillment  of  the 
Requirements  for  the  Degree  of  Doctor  of  Philosophy 

ESSAYS  ON  LONG  RUN  AND  SHORT  RUN  DYNAMICS  IN  ECONOMICS 

By 

Yikang  Li 
August  1995 


Chairperson:  Dr.  Mark  Rush 
Major  Department:  Economics 

The  present  dissertation  consists  of  three  essays  that  explore  a variety  of  issues  concerning 
economic  growth  trends  and  business  cycles. 

The  first  essay  deals  with  inference  on  cointegrating  regression  by  bootstrapping.  This 
essay  contributes  to  the  literature  by  providing  criteria  to  judge  the  performance  of  asymptotic 
methods  with  small  samples.  In  particular,  this  essay  proposes  a recursive  bootstrap  method  to 
construct  an  appropriate  confidence  interval  for  the  cointegrating  parameter  in  the  exogenous 
regressor  case.  The  results  first  show  that  the  student-t  distribution  provides  poor  coverage  of  the 
confidence  interval  for  the  conventional  OLS  estimator.  However,  the  most  salient  finding  of  this 
essay  is  that  the  bootstrap  percentile-t  method  provides  reliable  confidence  intervals. 

The  second  essay  proposes  a low-pass  filter  method  to  estimate  cointegrating  vectors.  It 
has  been  proved  that  the  proposed  Filtered  Least  Squares  estimator  is  asymptotically  more 
efficient  than  the  OLS  estimator;  and  the  Filtered  Fully  Modified  Least  Squares  estimator  shares 
asymptotic efficiency with the Fully Modified Least Squares estimator. The Monte Carlo study
suggests that both filtered estimators are significantly more efficient than the OLS and FMLS
estimators in finite samples.

The  third  essay  explores  the  detrending  issue  for  business  cycle  analysis.  This  essay 
examines  the  first  difference  detrending  method  and  the  conventional  HP  detrending  method. 
Based  on  the  analysis,  I propose  a two-step  band-pass  filtering  procedure  to  measure  the  fit  of 
dynamic  business  cycle  models.  By  conducting  this  procedure,  I can  obtain  the  standard  deviations 
and  the  correlations  contributed  by  any  suggested  frequency  band  width.  Finally,  it  is  pointed  out 
that  the  procedure  suggested  here  is  applicable  not  only  to  measuring  the  fit  of  calibrated  models, 
but  to  cases  that  require  the  detrending  or  the  isolation  of  a certain  range  of  frequencies  without 
imposing  any  prior  specifications. 




CHAPTER  1 
INTRODUCTION 


The  long  run  and  short  run  dynamic  behaviors  of  economic  systems  are  two  major 
research  topics  in  econometrics  and  macroeconomics.  Economists  often  refer  to  long  run  and  short 
run  dynamics  as  growth  trends  and  business  cycles,  respectively.  Over  recent  years,  the  growth 
literature  has  contributed  to  cointegration  theory  and  its  applications,  while  real  business  cycle 
models  have  become  an  important  stream  of  business  cycle  analysis.  This  dissertation  is  a 
collection  of  three  essays  that  explore  a variety  of  issues  concerning  growth  trends  and  business 
cycles. 

Since Engle and Granger's (1987) milestone paper on cointegration, much research has
been done on estimating and testing parameters in the context of cointegration. However, almost
all of the literature on cointegration is asymptotic in nature. Simulation studies (for example,
Banerjee et al. (1986) and Boswijk (1989)) have found that some asymptotic methods provide
poor approximations for the small sample sizes that are typical in economic data. In this
dissertation I propose a recursive bootstrapping method to find the empirical distributions of
cointegrating vectors in finite samples.

Recent  simulation  studies  (for  example,  Hansen  and  Phillips  (1990)  and  H.  Li  and  G.S. 
Maddala  (1994))  show  that  some  asymptotically  efficient  cointegrating  estimators  do  not  perform 
very  well  with  small  samples.  In  this  dissertation  I propose  a low-pass  spectral  filter  method  to 
estimate  cointegrating  vectors,  show  the  asymptotic  properties  of  the  proposed  low-pass  filter 
estimators, report the related Monte Carlo results, and argue that the conventional OLS estimator
and the Fully Modified Least Squares estimator could be improved by the low-pass method.

Since  the  publication  of  Kydland  and  Prescott’s  (1982)  pioneering  real  business  cycle 
paper,  the  use  of  calibrated  models  has  exploded  in  the  macroeconomics  literature.  An  important 
part  of  the  conventional  calibration  methodology  has  been  the  application  of  a filter,  usually  the 
HP  filter  but  sometimes  a difference  filter,  to  both  the  model-generated  and  the  actual  time  series 
data. This dissertation examines the detrending effects of applying a difference filter and the HP
high-pass filter to economic time series and argues that the conventional detrending method (the HP
high-pass filter with $\lambda = 1600$) is not appropriate for conducting business cycle analysis. I suggest
a two-step band-pass filtering procedure to measure the fit of dynamic business cycle models.

Finally, I point out that the procedure suggested here is applicable not only to measuring
the fit of calibrated models, but also to cases that require detrending or isolating a certain range of
frequencies without imposing any prior specifications.


CHAPTER  2 

BOOTSTRAPPING  COINTEGRATING  REGRESSION 
2.1  Introduction 

Since  Engle  and  Granger  (1987)  wrote  their  milestone  paper  on  cointegration,  much 
research  has  been  done  on  estimating  and  testing  parameters  in  the  context  of  cointegration 
(Phillips  (1987,  1991),  Stock  (1987)).  These  results,  however,  are  based  on  asymptotic  theory,  and 
large  samples  are  not  often  found  in  time  series  data.  Usually  only  short  spans  of  data  are 
available,  or  large  spans  are  broken  by  structural  changes.  Recently  the  small  sample  properties 
of cointegrating estimators have received attention (Banerjee et al. (1986) and Boswijk (1989)),
but  some  properties  of  statistical  inference  of  the  estimators  in  small  samples  are  still  unknown. 
Consequently  it  seems  useful  to  explore  the  small  sample  properties  of  cointegrating  estimators 
with  an  alternative  approach,  the  bootstrap  methods  initiated  by  Efron  (1979,  1981,  1985). 

Recent  applications  of  bootstrapping  regressions  can  be  found  in  Shea  (1989)  and  Vinod 
and McCullough (1991). In these models, an iid error structure and a fixed regressor (i.e., conditional
on a realization of $\{x_t\}$) are assumed so that the iid error bootstrap method can be applied. As
Jeong and Maddala (1992) point out, however, in most econometric applications the error
structures are in general more complicated. Moreover, the assumption of conditional regressors
deviates from the essence of the cointegrating process. The main purpose of this paper is to
provide an appropriate bootstrap method for cointegration regressions, judged by the reliability of
the coverage of its confidence intervals.




2.2  The  Basic  Model 

One direct way to find the true small sample distribution of the cointegrating
parameter is the Monte Carlo sampling method. This method is particularly valuable in situations
where the true models are fully known to us. To investigate the small sample properties of a few
bootstrap estimators for cointegration regressions with an exogenous regressor, we use the following
basic model:

$$y_t = \beta x_t + u_t, \tag{2.1}$$

$$x_t = x_{t-1} + e_{1t}, \qquad e_{1t} \sim \text{iid}(0, \sigma_1^2), \tag{2.2}$$

$$u_t = \rho u_{t-1} + e_{2t}, \qquad |\rho| < 1, \qquad e_{2t} \sim \text{iid}(0, \sigma_2^2). \tag{2.3}$$

In this model, we concentrate on the case in which the regressor $x_t$ is strictly exogenous. This requires
$\{e_{1t}\}$ to be generated independently of $\{e_{2t}\}$. In general, the OLS estimator is consistent and has
a nonnormal asymptotic distribution. Conditional on a realization of $\{x_t\}$, the OLS estimator,
when the regressor is strictly exogenous, is asymptotically normal (Park and Phillips, 1989). If
not conditional on the realization of $\{x_t\}$, $\sum_{t=1}^{T} x_t^2$ is $O_p(T^2)$ and $T^{-2}\sum_{t=1}^{T} x_t^2$ converges
(weakly) to a stochastic matrix, in contrast to the standard cases where $T^{-1}\sum_{t=1}^{T} x_t^2$ converges
(or is assumed to converge) to a constant matrix as $T$ tends to infinity.

2.3  The  Monte  Carlo  Experiment 

We take the basic model as the data generating process. In the process, $e_{1t}$ and $e_{2t}$ are
generated independently of each other as iid N(0, 1) and iid N(0, 0.25), respectively. We set $\beta = 2$ and
$\rho$ = 0.1, 0.3, 0.5, 0.7, 0.9, with sample sizes n = 20 and 40. The initial values of $x_t$ and $u_t$ are set to zero.

Since the elements of $u$ in equation (2.3) are serially correlated, the Cochrane-Orcutt
transformation, which drops the first observation, can lead to reduced efficiency in GLS estimation
of regression coefficients. So the Prais-Winsten (PW) transformation, which adjusts the first
observation by the factor $(1-\rho^2)^{1/2}$, is used. The PW estimator of $\rho$ is

$$\hat\rho = \sum_{t=2}^{n} \hat u_t \hat u_{t-1} \Big/ \sum_{t=1}^{n} \hat u_t^2. \tag{2.4}$$

Two  bootstrap  methods  are  considered. 

Method I: Treat $x_t$ as a fixed regressor (i.e., conditional on the realization of $\{x_t\}$). The
bootstrap procedure is the same as the conventional bootstrap method for a linear regression model
with AR(1) errors.

Method II: Treat $x_t$ as a random regressor. The procedure is the following:

(i) Compute the "uncorrelated" residuals $\{\hat e_{2t}\}$ by applying the conventional PW
transformation, and obtain $\{\hat e_{1t}\}$ by differencing $x_t$.

(ii) Resample $\{\hat e_{1t}\}$ and $\{\hat e_{2t}\}$: obtain $\{e^*_{1t}\}$ and $\{e^*_{2t}\}$ by drawing n times with
replacement from $\{\hat e_{1t}\}$ and $\{\hat e_{2t}\}$, respectively.

(iii) Construct a pair of bootstrap data $\{y^*_t, x^*_t\}$ by the formulas

$$x^*_t = x^*_{t-1} + e^*_{1t}, \tag{2.5}$$

$$u^*_t = \hat\rho\, u^*_{t-1} + e^*_{2t}, \tag{2.6}$$

$$y^*_t = \hat\beta x^*_t + u^*_t. \tag{2.7}$$


(iv) Reestimate by feasible PW GLS, in which

$$\hat\rho^* = \sum_{t=2}^{n} \hat u^*_t \hat u^*_{t-1} \Big/ \sum_{t=1}^{n} \hat u^{*2}_t. \tag{2.8}$$

Steps (ii) to (iv) were repeated 1000 times in each of 400 simulated Monte Carlo replications.
The 0.05 and 0.95 quantiles of the resulting empirical bootstrap distribution were used as critical
values for the confidence intervals.
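
The recursive scheme of Method II can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the code used for the reported experiments: the function names (`ar1`, `simulate`, `pw_fit`, `bootstrap_method_two`) are hypothetical, the regression has no intercept as in (2.1), the first-stage residuals come from OLS, and only the percentile interval (not the percentile-t interval) is computed.

```python
import numpy as np

def ar1(shocks, rho):
    """Build u_t = rho * u_{t-1} + e_t recursively, starting from u_0 = 0."""
    u = np.empty(len(shocks))
    prev = 0.0
    for t, shock in enumerate(shocks):
        prev = rho * prev + shock
        u[t] = prev
    return u

def simulate(n, beta, rho, rng):
    """Draw one sample from the basic model (2.1)-(2.3) with x_0 = u_0 = 0."""
    e1 = rng.normal(0.0, 1.0, n)   # variance 1
    e2 = rng.normal(0.0, 0.5, n)   # standard deviation 0.5, i.e. variance 0.25
    x = np.cumsum(e1)              # random walk regressor
    y = beta * x + ar1(e2, rho)
    return y, x

def pw_fit(y, x):
    """Feasible Prais-Winsten GLS for y_t = beta * x_t + u_t with AR(1) errors.

    Returns the GLS slope, the estimate of rho, and the transformed
    ("uncorrelated") residuals.
    """
    b_ols = (x @ y) / (x @ x)
    u = y - b_ols * x
    rho = (u[1:] @ u[:-1]) / (u @ u)          # as in equation (2.4)
    w = np.sqrt(1.0 - rho ** 2)               # first-observation adjustment
    ys = np.concatenate(([w * y[0]], y[1:] - rho * y[:-1]))
    xs = np.concatenate(([w * x[0]], x[1:] - rho * x[:-1]))
    b_gls = (xs @ ys) / (xs @ xs)
    return b_gls, rho, ys - b_gls * xs

def bootstrap_method_two(y, x, n_boot, rng):
    """Recursive bootstrap treating x_t as random (steps (i)-(iv))."""
    n = len(y)
    b_hat, rho_hat, e2_hat = pw_fit(y, x)      # step (i): residuals
    e1_hat = np.diff(x, prepend=0.0)           # step (i): differenced x, x_0 = 0
    draws = np.empty(n_boot)
    for i in range(n_boot):
        e1_star = rng.choice(e1_hat, size=n, replace=True)   # step (ii)
        e2_star = rng.choice(e2_hat, size=n, replace=True)
        x_star = np.cumsum(e1_star)                          # (2.5)
        u_star = ar1(e2_star, rho_hat)                       # (2.6)
        y_star = b_hat * x_star + u_star                     # (2.7)
        draws[i] = pw_fit(y_star, x_star)[0]                 # step (iv)
    lower, upper = np.quantile(draws, [0.05, 0.95])
    return b_hat, (lower, upper), draws
```

Calling `simulate(40, 2.0, 0.5, rng)` with `rng = np.random.default_rng(seed)` and passing the result to `bootstrap_method_two` gives one percentile interval; repeating this over many simulated samples and counting how often the interval contains the true value gives an empirical coverage rate of the kind reported in the next section.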


2.4 Simulation Results

Table 2.1 summarizes the results from our Monte Carlo experiment. The estimates of the
cointegrating parameter $\beta$, their standard deviations, and the true coverages of a 90% confidence
interval are tabulated for the respective values of $\rho$ and n. To facilitate comparison, the estimates
and the coverage for bootstrap method I (which treats $x_t$ as fixed) are included. These values are
recorded in parentheses.

Table 2.1 shows the following comparisons. (i) The coverage of the t-distribution goes
down drastically as $\rho$ goes up and decreases as the sample size increases. This is because the OLS
estimator has a nonnormal asymptotic distribution, which has been studied by Stock (1987). (ii)
The coverages of both the percentile and percentile-t intervals from method II are much greater
than that of the t-distribution. Compared to the percentile method, the percentile-t has a satisfyingly
stable coverage probability and improves coverage accuracy. This result is consistent with Hall's
(1988) finding that in quite general situations the percentile-t method has greater coverage than other
percentile methods. (iii) The coverages for bootstrap method I are extremely low, which implies that the
bootstrap method conditional on $x_t$ is not appropriate for cointegrating regression models.
Bootstrap method I underestimates the variance of a bootstrap estimator from a random regressor
model. (iv) While the coverage of the t-distribution decreases as the sample size increases from 20
to 40, the coverage of bootstrap method II (both percentile and percentile-t) improves.

Table 2.2 presents the average confidence intervals and their corresponding lengths for
bootstrap method II. It indicates the following points. (i) The lengths of the two types of
confidence interval increase uniformly as $\rho$ increases. (ii) The percentile-t method has shorter
confidence intervals than the percentile method, while the former has greater true coverage than
the latter for all $\rho$ and n (also see Table 2.1). (iii) The lengths for n = 40 are shorter than for n = 20
except for $\rho$ = 0.9.


Table 2.1: Comparisons of the true coverage of a 90% confidence interval ($\beta = 2$)

                              ρ = 0.1    ρ = 0.3    ρ = 0.5    ρ = 0.7    ρ = 0.9

    n = 20
    Ave Est**   OLS            1.998      2.005      2.004      1.989      1.983
                Bst*           1.997      2.003      2.002      1.988      1.957
                              (1.997)    (2.005)    (2.004)    (1.989)    (1.983)
    Ave SD      OLS            0.059      0.079      0.093      0.132      0.268
                Bst            0.070      0.094      0.168      0.342      1.012
                              (0.060)    (0.082)    (0.097)    (0.140)    (0.268)
    Coverage (1 - α = 0.9)
                t-dist.        0.885      0.790      0.723      0.575      0.388
                Percentile     0.860      0.775      0.755      0.593      0.470
                              (0.145)    (0.063)    (0.033)    (0.005)    (0.000)
                Perc-t         0.903      0.838      0.828      0.745      0.638
                              (0.135)    (0.075)    (0.045)    (0.010)    (0.000)

    n = 40
    Ave Est     OLS            1.998      2.000      1.998      2.016      1.972
                Bst            1.998      2.000      1.992      2.025      1.984
                              (1.999)    (2.000)    (1.998)    (2.016)    (1.970)
    Ave SD      OLS            0.029      0.038      0.054      0.081      0.164
                Bst            0.034      0.050      0.084      0.198      1.075
                              (0.029)    (0.039)    (0.056)    (0.083)    (0.172)
    Coverage (1 - α = 0.9)
                t-dist.        0.855      0.780      0.693      0.510      0.355
                Percentile     0.845      0.793      0.765      0.720      0.523
                              (0.085)    (0.018)    (0.025)    (0.005)    (0.005)
                Perc-t         0.910      0.880      0.868      0.845      0.835
                              (0.078)    (0.025)    (0.020)    (0.009)    (0.008)

* Values in parentheses correspond to simulation results obtained from bootstrap method I.

** Ave Est and Ave SD denote the average estimates of the cointegrating parameter, $\beta$, and the average standard deviation of the
estimates of $\beta$.


Table 2.2: Average confidence interval and length of confidence interval for bootstrap method II

                      Percentile                      Percentile-t
      ρ      Confidence Interval    Length    Confidence Interval    Length

    n = 20
     0.1       (1.902, 2.090)       0.188       (1.904, 2.089)       0.185
     0.3       (1.888, 2.116)       0.228       (1.893, 2.112)       0.219
     0.5       (1.842, 2.160)       0.318       (1.855, 2.147)       0.292
     0.7       (1.419, 2.493)       0.543       (1.530, 2.394)       0.466
     0.9       (1.419, 2.493)       1.074       (1.530, 2.394)       0.864

    n = 40
     0.1       (1.961, 2.048)       0.087       (1.960, 2.046)       0.086
     0.3       (1.938, 2.061)       0.123       (1.938, 2.060)       0.121
     0.5       (1.887, 2.096)       0.209       (1.894, 2.091)       0.197
     0.7       (1.832, 2.220)       0.388       (1.850, 2.197)       0.347
     0.9       (1.427, 2.537)       1.110       (1.531, 2.418)       0.887


2.5  Conclusion 

This study shows that one should not bootstrap cointegrating regression models conditional
on a realization of $\{x_t\}$. Instead, one should bootstrap $(y_t, x_t)$ to make appropriate inferences about
estimates of the cointegrating parameter, and the percentile-t method provides reliable confidence
intervals.


CHAPTER  3 

LOW-PASS  FILTERED  LEAST  SQUARES  ESTIMATORS  OF 
COINTEGRATING  VECTORS 

3.1  Introduction 

Filter  techniques  have  been  used  for  a long  time  in  estimating  the  states  of  stochastic 
dynamic  systems  or  in  extracting  information  from  noisy  observations.  The  works  of  Wiener 
(1949)  and  Kalman  (1960)  are  associated  with  advances  in  filtering  theory.  These  techniques  are 
also  widely  adopted  in  economics.  Economists  have  a long  history  of  employing  the  moving 
average  method  to  extract  long  run  trends  from  empirical  data.  More  recently,  macroeconomists 
have  used  the  Hodrick-Prescott  high-pass  filter  to  extract  high  frequency  bands,  the  business  cycle 
components,  from  macroeconomic  data  and  simulated  data. 

Filter techniques have also been shown to be useful in estimating coefficients of
econometric models. The principal advantage of applying filter methods is that the techniques
make it possible to isolate the most relevant frequencies. For example, Friedman's permanent
income hypothesis was tested by Engle (1974) by estimating the transitory (high frequency) and
the permanent (low frequency) MPCs through a filter device. Harvey et al. (1992) proposed a
quasi-maximum likelihood method based on the Kalman filter to estimate unobserved component
time series models with ARCH disturbances. So far, however, most of the theory and the
applications have been developed for models in which the time series are non-cointegrated. In contrast,
the purpose of this paper is to show how filtering may improve cointegration analysis.




Since Engle and Granger's (1987) seminal paper on cointegration, several new methods
of estimating the cointegrating vector have emerged in the literature [e.g., ordinary least squares
(OLS) by Engle and Granger (1987), nonlinear least squares by Stock (1987), principal
components by Stock and Watson (1988), canonical correlations by Park (1992), maximum
likelihood in a fully specified error correction model (ML) by Johansen (1988), fully modified
least squares (FMLS) by Phillips and Hansen (1990), and spectral regression (SR) by Phillips
(1991a)]. While all these estimators for cointegrating vectors share the super-consistency property
(the rate of convergence is $T$ instead of $T^{1/2}$), only a few are asymptotically efficient [i.e., ML,
FMLS, and SR].

Because of the presence of unit root processes and endogeneity, super-consistency
provides little information on sampling behavior. Simulation studies provide some insight into the
finite sample behavior of these estimators. Banerjee et al. (1986) first observed that the
super-consistency of OLS in cointegrating regressions was misleading in small samples. Phillips and
Hansen (1991) compared the empirical distributions of the OLS estimator, the Hendry error-correction
estimator, and the FMLS estimator in their simulation. They concluded that the asymptotics are
not only relevant, but also seem to furnish good discriminatory power for samples as small as 50.
A Monte Carlo study by Gonzalo (1994) suggested that the asymptotic properties of the five
compared estimators (least squares, nonlinear least squares, maximum likelihood in an error
correction model, principal components, and canonical correlations) are still valid for finite samples.

However,  the  explanatory  power  of  the  asymptotics  for  finite  sample  behavior  is  actually 
very  limited.  All  the  above  simulation  studies  revealed  that  the  properties  of  the  finite  sample 
distributions  depend  crucially  on  the  parameters  in  the  data  generating  processes.  For  example, 
in  a related  simulation  study,  Hansen  and  Phillips  (1990)  concluded  that  the  signal  to  noise  ratio 
is a more important factor affecting finite sample behavior than the degree of long run
endogeneity. If the ratio is high, the bias problem is negligible, and OLS works well. If, however,
the  ratio  is  low,  some  other  estimator  may  be  necessary  to  use  as  a first  stage  estimate  for  the 
FMLS  estimator.  Li,  Maddala  and  Rush  (1995)  proposed  a low-pass  filter  method  to  estimate  the 
cointegrating  vectors.  Their  simulation  study  showed  that  for  low  signal  to  noise  ratio  cases,  the 
low-pass  filter  estimators  worked  quite  well.  This  paper,  in  a more  general  way,  develops  the  low- 
pass  filter  technique  and  shows  the  asymptotic  properties  of  the  proposed  estimators  and  their 
finite  sample  performances. 

In  this  paper  we  adopt  Phillips’  triangular  ECM  representation  of  a cointegrated  system. 
By  doing  so,  we  can  enjoy  the  advantages  of  full  information  estimation  of  cointegrated  systems. 
That  implies  that  an  optimal  asymptotic  theory  of  inference  applies,  and  that  the  standard  chi- 
squared  distribution  may  be  used  to  test  the  cointegration  hypothesis. 

This  paper  shows  that,  like  the  spectral  regression  method,  the  filter  method  has  the 
advantage  of  permitting  a nonparametric  treatment  of  the  regression  errors.  Therefore,  the  filter 
method  avoids  the  problem  of  specifying  the  dynamics  of  the  system.  Moreover,  the  filter  method 
provides a class of low-pass filter estimators which share the same asymptotic efficiency as ML,
FMLS  and  SR. 

This paper is organized as follows. Section 3.2 specifies a triangular cointegrated system
and a class of low-pass filter estimators. Section 3.3 develops the asymptotic theory. Section 3.4
investigates the results of a Monte Carlo study. The conclusions are provided in Section 3.5, and the
proofs of the propositions are relegated to the appendix. The following notation is used for
ease of reading. We use the symbol "$\Rightarrow$" to signify convergence in distribution and the symbol
"$\equiv$" to signify equality in distribution. Stochastic processes such as the standard Brownian motion
$B(r)$ are frequently written as $B$ to simplify notation. Similarly, we write the Lebesgue integral
$\int_0^1 B(r)\,dr$ more simply as $\int_0^1 B$. Vector Brownian motion with covariance matrix $\Omega$ is denoted
as $BM(\Omega)$. Finally, all limits given in the paper are taken as the sample size $T \to \infty$ unless
otherwise stated.


3.2  The  Model  and  the  Estimators 

In this paper we use the block triangular cointegrated system introduced by Phillips
(1991a). The model is

$$x_{1t} = \beta' x_{2t} + u_{1t}, \tag{3.1}$$

$$\Delta x_{2t} = u_{2t}, \tag{3.2}$$

where

$$x_t = \begin{pmatrix} x_{1t} \\ x_{2t} \end{pmatrix} \equiv I(1),$$

with $x_{1t}$ a scalar and $x_{2t}$ an $m \times 1$ vector, is an integrated vector process with dimension $n = m + 1$, and where

$$u_t = \begin{pmatrix} u_{1t} \\ u_{2t} \end{pmatrix} \equiv I(0)$$

is a stationary process with continuous spectral density matrix $f_{uu}(\omega) > 0$ over $-\pi < \omega \le \pi$. Here,
equation (3.1) is a single cointegrating regression equation. This approach may be easily extended
to a multiple cointegrating regression system, in which $\beta$ is then a matrix of coefficients.


To simplify notation for the ECM representation, we assume that the partial sum process
$P_t = \sum_{j=1}^{t} u_j$ satisfies the multivariate invariance principle

$$T^{-1/2} P_{[Tr]} \Rightarrow B(r) \equiv BM(\Omega), \qquad 0 < r \le 1, \tag{3.3}$$

where $\Omega = 2\pi f_{uu}(0)$. We decompose the "long run" covariance matrix $\Omega$ as follows:

$$\Omega = \Sigma + \Lambda + \Lambda',$$

where

$$\Sigma = E(u_0 u_0'), \qquad \Lambda = \sum_{k=1}^{\infty} E(u_0 u_k').$$

Also, we define

$$\Delta = \Sigma + \Lambda.$$

In addition to equation (3.3), we make the conventional assumption of weak convergence of the
stochastic process constructed from the sample covariance between $P_t$ and $u_t$:

$$T^{-1} \sum_{t=1}^{[Tr]} P_t u_t' \Rightarrow \int_0^r B\,dB' + r\Delta. \tag{3.4}$$

For conditions under which (3.3) and (3.4) hold, see Phillips (1988) and the references cited
there.


Before  transforming  the  model,  we  make  the  following  assumption. 


Assumption 3.2.1 $h(L)$ and $H(L)$ are two-sided univariate and multivariate
symmetric polynomials in the lag operator $L$, respectively:

$$h(L) = \sum_{j=-\infty}^{\infty} h_j L^j \qquad \text{and} \qquad H(L) = \sum_{j=-\infty}^{\infty} H_j L^j,$$

where $h_j = h_{-j}$ and $H_j = H_{-j}$. Note that $h(L)$ and $H(L)$ possess roots only outside the unit
circle in terms of lags and leads.¹ Next, the $h_j$'s satisfy the following conditions:

(a) $\#\{j : h_j \neq 0,\ j > M\} = \infty$,

(b1) $\sum_{j=-\infty}^{\infty} |h_j| < \infty$,

(b2) $h(1) = \sum_{j=-\infty}^{\infty} h_j = 1$ and $H(1) = \sum_{j=-\infty}^{\infty} H_j = I$,

and

(c) $h_j = a_j b_j$, where $a_{j+1}/a_j = c < 1$ and $|b_j| \le 1$.

¹ Since $h(L)$ and $H(L)$ are two-sided symmetric polynomials, the above statement is equivalent
to saying that they have reciprocal roots in terms of lags only.

Condition (a) ensures that the filters are not truncated at any finite lag. (b1) is the
conventional convergence condition. This condition guarantees the convergence of the polynomials
at $|L| = 1$. Condition (b2) could be derived from condition (b1), $\sum_j |h_j| < \infty$, by normalization.
For convenience we make (b2) an explicit assumption. Condition (c) is slightly stronger than (b1).
Also, we define $h_p(L)$ and $H_p(L)$ as the truncated $p$-th order polynomials from
$h(L)$ and $H(L)$, i.e.,

$$h_p(L) = \sum_{j=-p}^{p} h_j L^j, \qquad H_p(L) = \sum_{j=-p}^{p} H_j L^j.$$

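Transforming (3.1) and (3.2) by $h_p(L)$ and $H_p(L)$, respectively, we obtain

$$y_{1t} = \beta' y_{2t} + v_{1t} \tag{3.5}$$

and

$$\Delta y_{2t} = v_{2t}, \tag{3.6}$$

where

$$y_{1t} = h_p(L) x_{1t}, \qquad v_{1t} = h_p(L) u_{1t}, \qquad y_{2t} = H_p(L) x_{2t}, \qquad v_{2t} = H_p(L) u_{2t}.$$

For this transformed system,

$$v_t = \mathcal{H}(L) u_t,$$

where

$$\mathcal{H}(L) = \begin{pmatrix} h_p(L) & 0 \\ 0 & H_p(L) \end{pmatrix}.$$

Since $h(L)$ and $H(L)$ are absolutely summable, $\mathcal{H}(L)$ is also absolutely summable.
The next proposition ensures that the transformation does not alter the cointegrating
property of the original system.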


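Proposition 3.2.2 The triangular system composed of (3.5) and (3.6) is still a cointegrated
system, and the error process $v_t$ is a stationary process with continuous spectral matrix
$f_{vv}(\omega) = |G(\omega)|^2 f_{uu}(\omega)$, where $G$ is the gain function of $h(L)$.

Thus, we can assume that the partial sum process $\tilde P_t = \sum_{j=1}^{t} v_j$ satisfies the multivariate
invariance principle

$$T^{-1/2} \tilde P_{[Tr]} \Rightarrow \tilde B(r) \equiv BM(\tilde\Omega), \qquad 0 < r \le 1, \tag{3.7}$$

where $\tilde\Omega = 2\pi f_{vv}(0)$. We decompose the "long run" covariance matrix $\tilde\Omega$ in the same way as its
counterpart $\Omega$:

$$\tilde\Omega = \tilde\Sigma + \tilde\Lambda + \tilde\Lambda',$$

and define $\tilde\Delta = \tilde\Sigma + \tilde\Lambda$. We thus also assume weak convergence of the process constructed from
the sample covariance between $\tilde P_t$ and $v_t$:

$$T^{-1} \sum_{t=1}^{[Tr]} \tilde P_t v_t' \Rightarrow \int_0^r \tilde B\,d\tilde B' + r\tilde\Delta.$$

It is convenient to partition the new Brownian motion $\tilde B$ and the matrices
$\tilde\Omega$, $\tilde\Sigma$, $\tilde\Lambda$, $\tilde\Delta$ conformably with the vector $y_t$. For example, we shall write

$$\tilde\Omega = \begin{pmatrix} \tilde\omega_{11} & \tilde\omega_{21}' \\ \tilde\omega_{21} & \tilde\Omega_{22} \end{pmatrix}$$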

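and so on. We also define $\tilde\omega_{11\cdot 2} = \tilde\omega_{11} - \tilde\omega_{21}'\tilde\Omega_{22}^{-1}\tilde\omega_{21}$.

To get the FMLS estimator, we subtract $\tilde\omega_{21}'\tilde\Omega_{22}^{-1}\Delta y_{2t}$ from both sides of equation (3.5),
arriving at

$$y_{1t}^{+} = \beta' y_{2t} + v_{1t}^{+},$$

where

$$y_{1t}^{+} = y_{1t} - \tilde\omega_{21}'\tilde\Omega_{22}^{-1}\Delta y_{2t}, \qquad v_{1t}^{+} = v_{1t} - \tilde\omega_{21}'\tilde\Omega_{22}^{-1}\Delta y_{2t}.$$

Thus, the OLS and FMLS estimators from the filtered transformed system (3.5) and (3.6) are

$$\hat\beta_{OLS} = \Big(\sum y_{2t} y_{2t}'\Big)^{-1}\Big(\sum y_{2t} y_{1t}\Big), \tag{3.8}$$

$$\hat\beta^{+} = \Big(\sum y_{2t} y_{2t}'\Big)^{-1}\Big(\sum y_{2t} y_{1t}^{+} - T\hat\delta^{+}\Big). \tag{3.9}$$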

It  is  worth  noting  that  condition  (a)  in  Assumption  2.1  is  not  trivial  for  nonstationary 
cases.  With  stationary  time  series,  one  may  use  a finite  moving  average  filter  to  extract  the  useful 
low  frequency  component  without  losing  asymptotic  efficiency.  However,  for  nonstationary  time 
series,  the  non-zero  restriction  on  hj  (i.e.  hj  *0)  is  necessary  in  order  to  obtain  the  transferred 
system  without  losing  any  useful  information.  It  is  clear  that  any  truncation  of  hj  (i.e.,  hj  *0,  j < 
M,  and  hj  = 0,  j >M.)  will  distort  the  underlying  Brownian  motion.  This  is  due  to  the  fact  that  the 


20 


zero values of h_j (j > M) make the transformed series no longer a partial sum process. On the
other hand, conventional finite low-pass filters cannot be extended to infinite filters. For example,
the finite low-pass filter, \frac{1}{2p+1} \sum_{j=-p}^{p} L^j, cannot be extended to the infinite filter defined
in Assumption 3.2.1 because the polynomial will not converge as p goes to infinity. Fortunately,
a few low-pass filters with the non-zero restriction satisfy the strong convergence condition stated
in Assumption 3.2.1(c). Two alternative low-pass filters may be used in practice. One is the
exponential smoothing low-pass filter (ES) employed by Lucas (1980) in his investigation of the
quantity theory of money and by Rush and Husted (1985) in their study of Purchasing Power Parity.
The other low-pass filter is well known for its counterpart, the high-pass Hodrick-Prescott
(HP) filter, which was introduced by Hodrick and Prescott (1980) and has since been widely
used in real business cycle analysis. These two filters can be obtained as the respective solutions
to the following minimization problems:

    \min_{\{\tau_t\}} \sum_{t=1}^{T} \left[ (y_t - \tau_t)^2 + \lambda (\tau_t - \tau_{t-1})^2 \right]    (3.10)

    \min_{\{\tau_t\}} \sum_{t=1}^{T} \left[ (y_t - \tau_t)^2 + \lambda \left( (\tau_{t+1} - \tau_t) - (\tau_t - \tau_{t-1}) \right)^2 \right]    (3.11)


where y_t is a nonstationary series and τ_t is a stochastic trend. The solutions to each of these two
minimization programs depend on the parameter 0 < λ < ∞ that "penalizes" variability from the
first difference of the stochastic trend in equation (3.10) or from the second difference of the
stochastic trend in equation (3.11).

King  and  Rebelo  (1993)  derived  the  expressions  for  the  two  filters  in  the  time  domain. 
Corresponding  to  equations  (3.10)  and  (3.11),  we  have 


    \tau_t = \frac{\theta}{\lambda} \, \frac{1}{1-\theta^2} \left[ \sum_{j=0}^{\infty} \theta^j y_{t-j} + \sum_{j=1}^{\infty} \theta^j y_{t+j} \right]    (3.12)

where θ = { (1+2λ) - [ (1+2λ)^2 - 4λ^2 ]^{1/2} } / (2λ),
and

    \tau_t = \frac{\theta_1 \theta_2}{\lambda} \left[ \sum_{j=0}^{\infty} (A_1 \theta_1^j + A_2 \theta_2^j) y_{t-j} + \sum_{j=1}^{\infty} (A_1 \theta_1^j + A_2 \theta_2^j) y_{t+j} \right]    (3.13)

where θ_1 and θ_2 are the solutions to the first order conditions for equation (3.11),
λ[ (1-L)^2 (1-L^{-1})^2 + 1/λ ] = 0. It is easy to see that θ_1 and θ_2 are complex
conjugates, whose values depend on λ (|θ_i| < 1). A_1 and A_2 are also complex conjugates. They
are related to the θ_i's by A_1 = [ (1 - θ_2/θ_1)(1 - θ_1^2)(1 - θ_1 θ_2) ]^{-1}. Because all the characteristics
defined in Assumption 3.2.1 are fulfilled, the legitimacy of these proposed transformations of a
cointegrated system can be verified. We state this as a Corollary for future reference.

Corollary 3.2.1 (a) The exponential smoothing filter implied in equation (3.12) is a
valid low-pass filter for a cointegrated system in equations (3.5) and (3.6).

(b) The HP filter implied in equation (3.13) is a valid low-pass filter for the cointegrated
system in equations (3.5) and (3.6).
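
The time-domain form of the exponential smoothing filter can be checked numerically. The sketch below is illustrative only: the function names are hypothetical, the two-sided weights (θ/λ) θ^|j| / (1 - θ^2) follow from the expression for θ given with equation (3.12), and the infinite sums are truncated at a large lag purely for computation. It verifies that θ lies inside the unit interval and that the weights sum to approximately one, which is what a unit gain at zero frequency requires.

```python
import numpy as np

def es_theta(lam):
    """Root theta of the exponential smoothing filter for a given lambda."""
    a = 1.0 + 2.0 * lam
    return (a - np.sqrt(a * a - 4.0 * lam * lam)) / (2.0 * lam)

def es_weights(lam, max_lag):
    """Two-sided ES weights on y_{t-j}, j = -max_lag..max_lag (truncated)."""
    theta = es_theta(lam)
    lags = np.arange(-max_lag, max_lag + 1)
    scale = (theta / lam) / (1.0 - theta ** 2)
    return lags, scale * theta ** np.abs(lags)

lam = 100.0                      # illustrative smoothing parameter
theta = es_theta(lam)
lags, weights = es_weights(lam, max_lag=400)
print("theta =", theta)
print("sum of truncated weights =", weights.sum())
```

Because the weights never become exactly zero at any finite lag, this filter respects the non-zero restriction on h_j discussed above; the truncation here is a numerical convenience, not part of the filter's definition.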


It  is  straightforward  to  get  the  gain  functions  of  these  low-pass  filters  from  the 
corresponding  gain  functions  of  the  high-pass  filters  derived  in  King  and  Rebelo  (1993): 


    G_{ES}(\omega) = \frac{1}{1 + 2\lambda [1 - \cos(\omega)]},    (3.14)

    G_{HP}(\omega) = \frac{1}{1 + 4\lambda [1 - \cos(\omega)]^2}.    (3.15)

These two gain functions have unit gain at zero frequency and decrease monotonically over [0,
π]. Decreasing λ shifts the gain function upward, moving a given frequency's gain closer to
unity.
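
The properties just stated for equations (3.14) and (3.15) are easy to confirm numerically. The following is a minimal sketch; the function names and the frequency grid are illustrative choices, not part of the original study.

```python
import numpy as np

def gain_es(omega, lam):
    """Gain of the exponential smoothing low-pass filter, equation (3.14)."""
    return 1.0 / (1.0 + 2.0 * lam * (1.0 - np.cos(omega)))

def gain_hp(omega, lam):
    """Gain of the HP low-pass filter, equation (3.15)."""
    return 1.0 / (1.0 + 4.0 * lam * (1.0 - np.cos(omega)) ** 2)

omega = np.linspace(0.0, np.pi, 201)      # frequency grid on [0, pi]
for lam in (10.0, 100.0):
    g_es = gain_es(omega, lam)
    g_hp = gain_hp(omega, lam)
    # Gain at zero frequency and at frequency pi for each filter.
    print(lam, g_es[0], g_es[-1], g_hp[0], g_hp[-1])
```

Comparing the two values of λ on the same grid shows the upward shift of the gain function as λ falls.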

Infinite low-pass filters, like the exponential smoothing filter and the HP low-pass filter,
function as trend extraction devices. Since our focus of interest is on the long run coefficients, that
is, the cointegrating vector α' = (1, -β'), the information contained in the stochastic trends is
relevant to estimating the cointegrating vector. The trend extraction mechanism may provide a
more favorable base upon which we can use the information more efficiently.


3.3.  Asymptotic  Theory 

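Proposition 3.3.1

    T(\tilde{\beta}_{OLS} - \beta) \Rightarrow \left( \int B_2 B_2' \right)^{-1} \left( \int B_2 \, dB_1 \right)    (3.16)

We know that the OLS estimator for the original system has the following asymptotic
behavior (see Phillips and Durlauf (1986)):

    T(\hat{\beta}_{OLS} - \beta) \Rightarrow \left( \int B_2 B_2' \right)^{-1} \left( \int B_2 \, dB_1 + \Delta_{21} \right)    (3.17)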

Notice that condition (b) in Assumption 3.2.1 implies that the value of the gain
functions evaluated at ω = 0 is 1, i.e., G(0) = 1. For the two low-pass filters mentioned above, both
gain functions in equations (3.14) and (3.15) have this property. Hence,
f_vv(0) = f_uu(0). Therefore, the invariance principles (3.3) and (3.7) are the same. That
is, the right-hand side of equation (3.16) is asymptotically equivalent to the first term on the
right-hand side of equation (3.17). Applying the low-pass filter eliminates the serial dependence bias,
Δ_21. This result shows that the filtered OLS estimator is asymptotically more efficient than the
unfiltered OLS estimator.

If we ignore the serial dependence bias, Δ_21, then, because the gain functions of the
low-pass filters shown in equations (3.14) and (3.15) are bounded at one over -π ≤ ω ≤ π, it is
easy to see that the spectral density matrix f_uu(ω) - f_vv(ω) is positive definite. This implies
that v_t is less volatile than u_t, because v_t has weaker short run dynamics. Of course, short run
behaviors  are  irrelevant  asymptotically  in  estimating  the  cointegrating  vector  because  the  effects 
of  short  run  dynamics  die  out  asymptotically.  However,  in  finite  samples,  these  short  run  effects 
should  be  reflected  in  the  exact  distributions.  So,  given  the  identical  spectral  density  matrix  of  the 
error  terms  at  zero  frequency,  or  "long  run"  covariance  matrix,  the  cointegration  system  with  the 
weaker  short  run  dynamics  should  have  an  advantage  in  revealing  its  long  run  equilibrium. 
Therefore,  one  may  expect  an  efficiency  gain  in  finite  samples  by  applying  these  low-pass  filters. 
It  turns  out  that  this  is  a very  attractive  merit  of  using  the  filtered  LS  method.  In  particular, 
because  efficient  estimators  like  FMLS  and  SR  rely  on  first-stage  estimates  of  the  cointegrating 
vector,  which  are  conventionally  supplied  by  OLS,  the  fact  that  the  OLS  estimate  has  a second- 
order  bias  means  that  these  efficient  estimators  are  perhaps  not  as  efficient  as  possible  in  finite 
samples.  Therefore,  some  improvement  will  be  expected  by  using  the  filtered  LS. 

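Proposition 3.3.2

    T(\hat{\beta}_{FMLS} - \beta) \Rightarrow \left( \int B_2 B_2' \right)^{-1} \left( \int B_2 \, dB_{1.2} \right)    (3.18)

where

    B_{1.2} = B_1 - \omega_{21}' \Omega_{22}^{-1} B_2

and

    \omega_{11.2} = \omega_{11} - \omega_{21}' \Omega_{22}^{-1} \omega_{21}.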

The limit distribution in equation (3.18) is the same as that of the unfiltered FMLS. This
implies that both share the same asymptotic efficiency. But, for the reasons discussed above,
an efficiency gain in finite samples can be anticipated.

Note that the limit distribution in equation (3.16) implicitly contains a unit root term,
which is generally nonnormal and dependent upon nuisance parameters. Consequently, this
invalidates the use of standard distributions for hypothesis tests on the cointegrating vector.
However, the asymptotics of equation (3.18) suggest that hypothesis testing may be easily
conducted using an asymptotic Chi-squared criterion. Suppose that we have the following
hypothesis test involving g restrictions on β of the form

    H_0: R\beta = r,

where R is a matrix describing the g restrictions and r is a vector. The Wald statistic is of the
form

    W = (R\hat{\beta}_{FMLS} - r)' \left[ R \, \widehat{Var}(\hat{\beta}_{FMLS}) \, R' \right]^{-1} (R\hat{\beta}_{FMLS} - r).

We then have the following proposition:


Proposition 3.3.3 Under H_0, W \Rightarrow \chi^2(g).


The asymptotic theory of the hypothesis test is, therefore, exactly the same as for


conventional FMLS.
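
The Wald criterion described above can be sketched in a few lines. Everything numerical here is hypothetical and chosen only to make the arithmetic transparent: the estimate, its estimated variance, and the restriction are not taken from the study, and the function name is an illustrative choice. The statistic is compared with the 5 percent critical value of a Chi-squared distribution with one degree of freedom, 3.841.

```python
import numpy as np

def wald_statistic(beta_hat, cov_hat, R, r):
    """W = (R b - r)' [R V R']^{-1} (R b - r) for g linear restrictions."""
    R = np.atleast_2d(np.asarray(R, dtype=float))
    diff = R @ np.asarray(beta_hat, dtype=float) - np.asarray(r, dtype=float)
    middle = R @ np.asarray(cov_hat, dtype=float) @ R.T
    return float(diff @ np.linalg.solve(middle, diff))

# Hypothetical inputs: a single coefficient estimate and its estimated variance.
beta_hat = np.array([2.1])
cov_hat = np.array([[0.0025]])
R = np.array([[1.0]])            # one restriction (g = 1): beta = 2
r = np.array([2.0])

W = wald_statistic(beta_hat, cov_hat, R, r)
CHI2_1DF_5PCT = 3.841            # 5% critical value, chi-squared with 1 d.f.
print("W =", W, "reject H0 at 5%:", W > CHI2_1DF_5PCT)
```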


3.4. A Monte Carlo Study

The following data generating process (DGP) is based on Banerjee et al. (1986), Hansen
and  Phillips  (1990),  and  Gonzalo  (1994).  We  use  this  DGP  in  order  to  study  the  finite  sample 
performances  of  these  proposed  estimators  on  the  dimensions  of  the  signal  to  noise  ratio,  serial 
correlation,  and  endogeneity: 

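    x_{1t} - \beta x_{2t} = u_{1t},    (1 - \rho L) u_{1t} = e_{1t},    (3.19a)

    a x_{1t} + x_{2t} = u_{2t},    (1 - L) u_{2t} = e_{2t},    (3.19b)

where

    \begin{pmatrix} e_{1t} \\ e_{2t} \end{pmatrix} \sim iid \, N\left( \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \begin{pmatrix} 1 & \sigma\theta \\ \sigma\theta & \sigma^2 \end{pmatrix} \right),    |\rho| < 1,  |\theta| < 1.    (3.20)

In the simulation we set ρ = 0.2, 0.4, 0.6, 0.8, θ = 0.3, 0.6, and the signal to noise ratio, σ, equal
to 0.3, 0.5, 1, 2, 5. We also set β = 2 and a = 0.² Sample sizes are 64 and 128. In all our simulations,
we generated 500 replications, starting with u_{1t} = 0 and u_{2t} = 0, and then discarded the initial 100
observations.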
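
A single draw from this DGP with a = 0 can be sketched as follows. This is a minimal illustration under stated assumptions rather than the code used for the reported tables: the function names, the random seed, and the no-intercept form of the OLS regression are all illustrative choices.

```python
import numpy as np

def simulate_dgp(T, rho, theta, sigma, beta=2.0, burn=100, rng=None):
    """One sample from (3.19)-(3.20) with a = 0; the first `burn` draws are discarded."""
    rng = np.random.default_rng() if rng is None else rng
    n = T + burn
    cov = np.array([[1.0, sigma * theta],
                    [sigma * theta, sigma ** 2]])
    e = rng.multivariate_normal(np.zeros(2), cov, size=n)
    u1 = np.zeros(n)
    u1_prev = 0.0                       # zero starting value
    for t in range(n):
        u1[t] = rho * u1_prev + e[t, 0]  # (1 - rho L) u_1t = e_1t
        u1_prev = u1[t]
    u2 = np.cumsum(e[:, 1])             # (1 - L) u_2t = e_2t, starting from zero
    x2 = u2                             # a = 0 implies x_2t = u_2t
    x1 = u1 + beta * u2                 # x_1t = u_1t + beta * x_2t
    return x1[burn:], x2[burn:]

def ols_slope(x1, x2):
    """No-intercept OLS slope from regressing x1 on x2 (an assumed specification)."""
    return float((x2 @ x1) / (x2 @ x2))

rng = np.random.default_rng(12345)
x1, x2 = simulate_dgp(T=64, rho=0.2, theta=0.3, sigma=1.0, rng=rng)
print("OLS estimate of beta:", ols_slope(x1, x2))
```

Repeating the draw 500 times for each (ρ, θ, σ, T) combination and averaging the estimation errors would give bias and RMSE figures of the kind reported in the tables below.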

Intuitively, the idea of a low-pass spectral filter method is to remove the high frequency
noise component by filtering and then estimate a cointegration equation using the remaining low
frequency signal component. To see this, we rewrite equation (3.19) (with a = 0 and β = 2) as
the following reduced form:

    x_{1t} = u_{1t} + 2u_{2t},    (3.21a)

    x_{2t} = u_{2t}.    (3.21b)

² A similar data generating process with a ≠ 0 is used in Li, Maddala and Rush (1995).

In  this  model  the  stochastic  trends  are  of  interest.  In  the  terminology  of  filtering  theory, 
we  treat  the  stochastic  trends  as  the  low  frequency  signals  and  the  cyclical  components  as  the  high 
frequency  noise.  Thus  we  can  rewrite  equation  (3.21)  as 

    x_t = s_t + n_t,    (3.22)

where

    n_t = \begin{pmatrix} u_{1t} \\ 0 \end{pmatrix},    s_t = \begin{pmatrix} 2u_{2t} \\ u_{2t} \end{pmatrix}.

Next, we apply a low-pass filter to equation (3.22). In this simulation study we use only the
low-pass HP filter, though it would also be possible to use the exponential smoothing filter. We fix
λ = 100 in our HP filter.³
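
The finite-sample HP low-pass filter, taken as the minimizer of the objective in equation (3.11), has a closed-form solution: the trend solves (I + λ D'D) τ = y, where D is the second-difference matrix. The sketch below illustrates this under stated assumptions. The function name is hypothetical, dense matrices are used for simplicity, and the small filtered regression at the end (no intercept, simulated series with a fixed seed) is only an example of how filtered least squares could be computed, not the study's own code.

```python
import numpy as np

def hp_trend(y, lam=100.0):
    """Finite-sample HP low-pass (trend) component: solves (I + lam * D'D) tau = y."""
    y = np.asarray(y, dtype=float)
    T = y.shape[0]
    D = np.zeros((T - 2, T))            # second-difference operator
    for t in range(T - 2):
        D[t, t:t + 3] = (1.0, -2.0, 1.0)
    return np.linalg.solve(np.eye(T) + lam * (D.T @ D), y)

# Illustrative cointegrated pair with slope 2 (simulated, not the study's data).
rng = np.random.default_rng(0)
x2 = np.cumsum(rng.normal(size=64))     # stochastic trend
x1 = 2.0 * x2 + rng.normal(size=64)     # trend plus high frequency noise

x1_f, x2_f = hp_trend(x1), hp_trend(x2)          # filter both series with lambda = 100
beta_ols = float((x2 @ x1) / (x2 @ x2))          # unfiltered least squares
beta_fls = float((x2_f @ x1_f) / (x2_f @ x2_f))  # filtered least squares
print(beta_ols, beta_fls)
```

A linear series has zero second differences, so the filter returns it unchanged; this is a convenient sanity check on any implementation.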

Table 3.1 and Table 3.2 summarize the performance of the four estimators. Eighty
parameter settings are reported. We observe several general results in these tables. First, the bias
and the RMSE of OLS estimation increase monotonically as the signal to noise ratio decreases
and/or the serial correlation increases. Second, comparing Table 3.1 and Table 3.2, we can see
that, in general, the estimates of OLS and FMLS get more biased and dispersed as θ increases.
Since θ reflects the correlation between the disturbances in the model, stronger endogeneity effects
on both the bias and the dispersion are expected when the correlation increases. Third, FMLS
performs better than OLS. In fact, in all cases FMLS has a smaller bias and a smaller root mean
square error (RMSE) than OLS. These results are consistent with Hansen and Phillips's
study (1990). Fourth, as the sample size increases from 64 to 128, both the bias and the RMSE of
all four estimators improve in all parameter settings. This demonstrates, to a certain extent, a
consistency with the asymptotic properties.

³ To apply the HP low-pass filter to data, one could use the truncated version (at some fixed lag
p) of the filter expressed by equation (3.13), or, alternatively, the finite HP low-pass filter, which is the
solution to equation (3.11). In this study, we use the latter.

Define the bias gain as the difference between the bias of the unfiltered estimates and the
bias of the filtered estimates. Define the RMSE gain similarly. Then, looking at Table 3.3, which
reports the 160 bias and RMSE gains, the following observations warrant special attention:

(i) All the gains are positive. In other words, filtering improves both the bias and RMSE
performances for both OLS and FMLS. This evidence is strongly in favor of the low-pass filter
method. It is worth noting that changing ρ, the speed of adjustment parameter, from 0.2 to 0.8
does not alter the sign of the gain from filtering. Bewley et al. (1994) reported in their simulation
study that the finite sample performances of some cointegration estimators are sensitive
to the speed of adjustment. In a quite similar environment, they showed that with moderate speeds
of adjustment (ρ = 0.3, 0.6), Johansen's ML estimator beats Box-Tiao's canonical estimator, but
with a slow adjustment speed (ρ = 0.9), the reverse is true. We do not see this phenomenon when using
a low-pass filter.

(ii) The gains (bias and RMSE) from filtering decrease as the signal to noise ratios
increase. This should not be surprising. In the low signal to noise ratio cases, significant gains are
obtained (in some cases more than 80 percent of the bias and more than 70 percent of the RMSE
have been reduced). However, in the cases where the ratios are very high, such as σ = 5, not much
gain is achieved. Intuitively, this implies that if the observed data are less contaminated by high
frequency noise, less gain will be achieved by filtering out the noise.


Table 3.1: Comparison of Bias (RMSE)  T=64

                                            σ
          0.3               0.5               1                 2                 5

ρ=0.2  θ=0.3
OLS       0.0439 (0.1023)   0.0291 (0.0702)   0.0161 (0.0380)   0.0088 (0.0217)   0.0039 (0.0095)
FMLS      0.0147 (0.0837)   0.0116 (0.0692)   0.0071 (0.0433)   0.0033 (0.0206)   0.0012 (0.0083)
F-LS      0.0095 (0.0752)   0.0081 (0.0570)   0.0035 (0.0295)   0.0028 (0.0181)   0.0011 (0.0075)
F-FMLS    0.0025 (0.0777)   0.0038 (0.0629)   0.0018 (0.0370)   0.0012 (0.0197)   0.0003 (0.0077)

ρ=0.4  θ=0.3
OLS       0.0684 (0.1349)   0.0412 (0.0922)   0.0160 (0.0536)   0.0112 (0.0258)   0.0053 (0.0121)
FMLS      0.0264 (0.1181)   0.0142 (0.0832)   0.0025 (0.0460)   0.0038 (0.0249)   0.0017 (0.0103)
F-LS      0.0237 (0.1068)   0.0125 (0.0711)   0.0027 (0.0436)   0.0037 (0.0204)   0.0017 (0.0092)
F-FMLS    0.0097 (0.1164)   0.0036 (0.0774)   0.0020 (0.0454)   0.0007 (0.0242)   0.0004 (0.0102)

ρ=0.6  θ=0.3
OLS       0.0935 (0.2225)   0.0592 (0.1422)   0.0304 (0.0687)   0.0167 (0.0372)   0.0090 (0.0189)
FMLS      0.0229 (0.2010)   0.0202 (0.1281)   0.0107 (0.0656)   0.0044 (0.0347)   0.0035 (0.0147)
F-LS      0.0314 (0.1776)   0.0240 (0.1248)   0.0152 (0.0618)   0.0068 (0.0340)   0.0040 (0.0146)
F-FMLS    0.0034 (0.2006)   0.0092 (0.1267)   0.0058 (0.0655)   0.0003 (0.0344)   0.0015 (0.0144)

ρ=0.8  θ=0.3
OLS       0.1678 (0.3666)   0.0979 (0.2350)   0.0525 (0.1242)   0.0309 (0.0684)   0.0144 (0.0303)
FMLS      0.0503 (0.3507)   0.0335 (0.2221)   0.0228 (0.1214)   0.0131 (0.0614)   0.0057 (0.0258)
F-LS      0.1062 (0.3457)   0.0630 (0.2292)   0.0348 (0.1212)   0.0204 (0.0630)   0.0094 (0.0276)
F-FMLS    0.0291 (0.3455)   0.0225 (0.2217)   0.0170 (0.1213)   0.0095 (0.0612)   0.0041 (0.0257)

Table 3.1: (Continued)  T=128

                                            σ
          0.3               0.5               1                 2                 5

ρ=0.2  θ=0.3
OLS       0.0276 (0.0579)   0.0188 (0.0381)   0.0095 (0.0195)   0.0057 (0.0113)   0.0031 (0.0058)
FMLS      0.0042 (0.0463)   0.0043 (0.0303)   0.0014 (0.0153)   0.0014 (0.0093)   0.0006 (0.0041)
F-LS      0.0045 (0.0434)   0.0058 (0.0331)   0.0024 (0.0146)   0.0017 (0.0083)   0.0007 (0.0038)
F-FMLS    0.0013 (0.0460)   0.0017 (0.0301)   0.0003 (0.0150)   0.0004 (0.0090)   0.0000 (0.0039)

ρ=0.4  θ=0.3
OLS       0.0358 (0.0684)   0.0228 (0.0475)   0.0121 (0.0258)   0.0060 (0.0127)   0.0038 (0.0067)
FMLS      0.0067 (0.0482)   0.0043 (0.0436)   0.0020 (0.0203)   0.0008 (0.0104)   0.0007 (0.0049)
F-LS      0.0141 (0.0516)   0.0075 (0.0347)   0.0041 (0.0204)   0.0015 (0.0098)   0.0012 (0.0047)
F-FMLS    0.0026 (0.0459)   0.0011 (0.0400)   0.0004 (0.0202)   0.0003 (0.0104)   0.0000 (0.0048)

ρ=0.6  θ=0.3
OLS       0.0603 (0.2288)   0.0412 (0.0782)   0.0196 (0.0406)   0.0106 (0.0213)   0.0056 (0.0103)
FMLS      0.0092 (0.0862)   0.0117 (0.0640)   0.0029 (0.0316)   0.0018 (0.0164)   0.0009 (0.0070)
F-LS      0.0265 (0.0913)   0.0209 (0.0638)   0.0086 (0.0318)   0.0048 (0.0173)   0.0025 (0.0074)
F-FMLS    0.0021 (0.0843)   0.0072 (0.0634)   0.0003 (0.0311)   0.0005 (0.0163)   0.0003 (0.0068)

ρ=0.8  θ=0.3
OLS       0.1233 (0.2385)   0.0679 (0.1347)   0.0373 (0.0753)   0.0209 (0.0422)   0.0101 (0.0187)
FMLS      0.0325 (0.1793)   0.0143 (0.1125)   0.0079 (0.0642)   0.0055 (0.0336)   0.0024 (0.0143)
F-LS      0.0784 (0.2044)   0.0435 (0.1196)   0.0239 (0.0678)   0.0134 (0.0364)   0.0065 (0.0159)
F-FMLS    0.0224 (0.1759)   0.0092 (0.1123)   0.0049 (0.0641)   0.0037 (0.0335)   0.0016 (0.0142)


Table 3.2: Comparison of Bias (RMSE)  T=64

                                            σ
          0.3               0.5               1                 2                 5

ρ=0.2  θ=0.6
OLS       0.0528 (0.1024)   0.0354 (0.0637)   0.0204 (0.0377)   0.0134 (0.0239)   0.0073 (0.0138)
FMLS      0.0179 (0.0720)   0.0119 (0.0398)   0.0070 (0.0238)   0.0048 (0.0155)   0.0028 (0.0085)
F-LS      0.0110 (0.0541)   0.0083 (0.0365)   0.0050 (0.0240)   0.0037 (0.0143)   0.0019 (0.0071)
F-FMLS    0.0020 (0.0459)   0.0028 (0.0329)   0.0017 (0.0209)   0.0014 (0.0119)   0.0007 (0.0057)

ρ=0.4  θ=0.6
OLS       0.0684 (0.1236)   0.0565 (0.0996)   0.0285 (0.0514)   0.0193 (0.0331)   0.0107 (0.0194)
FMLS      0.0178 (0.0794)   0.0208 (0.0685)   0.0084 (0.0314)   0.0060 (0.0195)   0.0036 (0.0120)
F-LS      0.0189 (0.0731)   0.0188 (0.0569)   0.0090 (0.0324)   0.0058 (0.0194)   0.0036 (0.0101)
F-FMLS    0.0010 (0.0660)   0.0070 (0.0466)   0.0017 (0.0256)   0.0012 (0.0161)   0.0010 (0.0077)

ρ=0.6  θ=0.6
OLS       0.1027 (0.1747)   0.0763 (0.1379)   0.0475 (0.0783)   0.0279 (0.0507)   0.0143 (0.0260)
FMLS      0.0260 (0.0999)   0.0238 (0.0901)   0.0137 (0.0501)   0.0042 (0.0158)   0.0042 (0.0158)
F-LS      0.0447 (0.1105)   0.0330 (0.0835)   0.0194 (0.0523)   0.0061 (0.0160)   0.0061 (0.0160)
F-FMLS    0.0062 (0.0870)   0.0065 (0.0671)   0.0036 (0.0454)   0.0013 (0.0124)   0.0013 (0.0124)

ρ=0.8  θ=0.6
OLS       0.1946 (0.3295)   0.1383 (0.2464)   0.0796 (0.1326)   0.0493 (0.0846)   0.0249 (0.0420)
FMLS      0.0526 (0.2199)   0.0423 (0.1673)   0.0285 (0.0919)   0.0181 (0.0583)   0.0063 (0.0259)
F-LS      0.1131 (0.2476)   0.0832 (0.1940)   0.0490 (0.1019)   0.0323 (0.0675)   0.0143 (0.0311)
F-FMLS    0.0238 (0.2147)   0.0224 (0.1584)   0.0181 (0.0859)   0.0125 (0.0530)   0.0025 (0.0260)


Table 3.2: (Continued)  T=128

                                            σ
          0.3               0.5               1                 2                 5

ρ=0.2  θ=0.6
OLS       0.0359 (0.0603)   0.0298 (0.0467)   0.0149 (0.0254)   0.0095 (0.0155)   0.0050 (0.0078)
FMLS      0.0074 (0.0263)   0.0072 (0.0228)   0.0033 (0.0122)   0.0023 (0.0105)   0.0011 (0.0035)
F-LS      0.0094 (0.0291)   0.0097 (0.0238)   0.0043 (0.0116)   0.0030 (0.0080)   0.0015 (0.0038)
F-FMLS    0.0016 (0.0199)   0.0022 (0.0182)   0.0007 (0.0094)   0.0009 (0.0077)   0.0003 (0.0024)

ρ=0.4  θ=0.6
OLS       0.0507 (0.0794)   0.0351 (0.0548)   0.0214 (0.0344)   0.0132 (0.0220)   0.0073 (0.0119)
FMLS      0.0101 (0.0352)   0.0056 (0.0245)   0.0042 (0.0151)   0.0028 (0.0115)   0.0013 (0.0055)
F-LS      0.0182 (0.0392)   0.0119 (0.0285)   0.0076 (0.0171)   0.0045 (0.0114)   0.0023 (0.0058)
F-FMLS    0.0025 (0.0269)   0.0008 (0.0228)   0.0007 (0.0128)   0.0007 (0.0079)   0.0001 (0.0044)

ρ=0.6  θ=0.6
OLS       0.0739 (0.1182)   0.0538 (0.0900)   0.0280 (0.0456)   0.0167 (0.0279)   0.0094 (0.0150)
FMLS      0.0094 (0.0550)   0.0089 (0.0528)   0.0037 (0.0210)   0.0026 (0.0142)   0.0012 (0.0072)
F-LS      0.0306 (0.0707)   0.0223 (0.0512)   0.0114 (0.0271)   0.0072 (0.0165)   0.0041 (0.0087)
F-FMLS    0.0001 (0.0503)   0.0010 (0.0441)   0.0003 (0.0201)   0.0004 (0.0126)   0.0007 (0.0061)

ρ=0.8  θ=0.6
OLS       0.1397 (0.2180)   0.0900 (0.1380)   0.0553 (0.0847)   0.0335 (0.0523)   0.0174 (0.0280)
FMLS      0.0318 (0.1119)   0.0160 (0.0656)   0.0132 (0.0466)   0.0071 (0.0265)   0.0043 (0.0146)
F-LS      0.0881 (0.1568)   0.0549 (0.0978)   0.0359 (0.0625)   0.0207 (0.0370)   0.0111 (0.0202)
F-FMLS    0.0135 (0.1016)   0.0088 (0.0641)   0.0093 (0.0421)   0.0043 (0.0245)   0.0024 (0.0129)


Table 3.3: Bias (RMSE) Gain from Filtering

                                            σ
          0.3               0.5               1                 2                 5

T=64

ρ=0.2  θ=0.3
OLS       0.0344 (0.0271)   0.0210 (0.0132)   0.0126 (0.0085)   0.0060 (0.0036)   0.0028 (0.0020)
FMLS      0.0122 (0.0060)   0.0078 (0.0063)   0.0053 (0.0063)   0.0021 (0.0009)   0.0009 (0.0006)

ρ=0.4  θ=0.3
OLS       0.0447 (0.0281)   0.0287 (0.0211)   0.0133 (0.0100)   0.0075 (0.0054)   0.0036 (0.0029)
FMLS      0.0167 (0.0017)   0.0106 (0.0058)   0.0005 (0.0006)   0.0031 (0.0007)   0.0013 (0.0001)

ρ=0.6  θ=0.3
OLS       0.0621 (0.0449)   0.0322 (0.0174)   0.0152 (0.0069)   0.0099 (0.0032)   0.0050 (0.0043)
FMLS      0.0195 (0.0004)   0.0110 (0.0014)   0.0049 (0.0001)   0.0041 (0.0003)   0.0020 (0.0003)

ρ=0.8  θ=0.3
OLS       0.0616 (0.0209)   0.0349 (0.0058)   0.0177 (0.0030)   0.0105 (0.0054)   0.0050 (0.0027)
FMLS      0.0212 (0.0052)   0.0110 (0.0004)   0.0058 (0.0001)   0.0036 (0.0002)   0.0016 (0.0001)

T=128

ρ=0.2  θ=0.3
OLS       0.0231 (0.0145)   0.0130 (0.0050)   0.0071 (0.0049)   0.0040 (0.0030)   0.0024 (0.0020)
FMLS      0.0029 (0.0003)   0.0026 (0.0002)   0.0011 (0.0003)   0.0010 (0.0003)   0.0006 (0.0002)

ρ=0.4  θ=0.3
OLS       0.0217 (0.0168)   0.0153 (0.0128)   0.0080 (0.0054)   0.0045 (0.0029)   0.0026 (0.0020)
FMLS      0.0041 (0.0023)   0.0032 (0.0036)   0.0016 (0.0001)   0.0005 (0.0000)   0.0007 (0.0001)

ρ=0.6  θ=0.3
OLS       0.0338 (0.1375)   0.0203 (0.0144)   0.0110 (0.0088)   0.0058 (0.0040)   0.0031 (0.0029)
FMLS      0.0071 (0.0019)   0.0045 (0.0006)   0.0026 (0.0005)   0.0013 (0.0001)   0.0006 (0.0002)

ρ=0.8  θ=0.3
OLS       0.0449 (0.0341)   0.0244 (0.0151)   0.0134 (0.0075)   0.0075 (0.0058)   0.0036 (0.0028)
FMLS      0.0101 (0.0034)   0.0051 (0.0002)   0.0030 (0.0001)   0.0018 (0.0001)   0.0008 (0.0001)

Table 3.3: (Continued)

                                            σ
          0.3               0.5               1                 2                 5

T=64

ρ=0.2  θ=0.6
OLS       0.0418 (0.0483)   0.0271 (0.0272)   0.0154 (0.0137)   0.0097 (0.0096)   0.0054 (0.0067)
FMLS      0.0159 (0.0261)   0.0091 (0.0069)   0.0053 (0.0029)   0.0034 (0.0036)   0.0021 (0.0028)

ρ=0.4  θ=0.6
OLS       0.0495 (0.0505)   0.0377 (0.0427)   0.0195 (0.0190)   0.0135 (0.0137)   0.0071 (0.0093)
FMLS      0.0168 (0.0134)   0.0138 (0.0219)   0.0067 (0.0058)   0.0048 (0.0034)   0.0026 (0.0043)

ρ=0.6  θ=0.6
OLS       0.0580 (0.0642)   0.0433 (0.0544)   0.0281 (0.0260)   0.0218 (0.0347)   0.0082 (0.0100)
FMLS      0.0198 (0.0129)   0.0173 (0.0230)   0.0101 (0.0047)   0.0029 (0.0034)   0.0029 (0.0034)

ρ=0.8  θ=0.6
OLS       0.0815 (0.0819)   0.0551 (0.0524)   0.0306 (0.0307)   0.0170 (0.0171)   0.0106 (0.0109)
FMLS      0.0288 (0.0052)   0.0199 (0.0089)   0.0104 (0.0060)   0.0056 (0.0053)   0.0038 (0.0001)

T=128

ρ=0.2  θ=0.6
OLS       0.0265 (0.0312)   0.0201 (0.0229)   0.0106 (0.0138)   0.0065 (0.0075)   0.0035 (0.0040)
FMLS      0.0058 (0.0064)   0.0050 (0.0046)   0.0026 (0.0028)   0.0014 (0.0028)   0.0008 (0.0011)

ρ=0.4  θ=0.6
OLS       0.0325 (0.0402)   0.0232 (0.0263)   0.0138 (0.0173)   0.0087 (0.0106)   0.0050 (0.0061)
FMLS      0.0076 (0.0083)   0.0048 (0.0017)   0.0035 (0.0023)   0.0021 (0.0036)   0.0012 (0.0011)

ρ=0.6  θ=0.6
OLS       0.0433 (0.0475)   0.0315 (0.0388)   0.0166 (0.0185)   0.0095 (0.0114)   0.0053 (0.0063)
FMLS      0.0093 (0.0047)   0.0079 (0.0087)   0.0034 (0.0009)   0.0022 (0.0016)   0.0005 (0.0011)

ρ=0.8  θ=0.6
OLS       0.0516 (0.0612)   0.0351 (0.0402)   0.0194 (0.0222)   0.0128 (0.0153)   0.0063 (0.0078)
FMLS      0.0183 (0.0103)   0.0072 (0.0015)   0.0039 (0.0045)   0.0028 (0.0020)   0.0019 (0.0017)


(iii) The gains from filtering when using OLS are significantly greater than the gains when
using FMLS, especially in the low signal to noise ratio cases. These results are consistent with
the asymptotic property discussed in Section 3.3. Because filtered OLS is more efficient than
OLS, while the filtered and unfiltered FMLS share the same asymptotic distribution, more
efficiency gains from OLS are expected.

(iv) In general, more gains are obtained when stronger endogeneity (in the sense of a higher
value of θ) is present in the DGP. This feature is, indeed, attractive. Generally in cointegrated
systems, the regressors are endogenous and system innovations are serially correlated. Along with
other studies (see Hansen and Phillips (1990) and Gonzalo (1994)), this study shows that in finite
samples with the presence of endogeneity, cointegrating estimators, including some efficient
estimators such as ML and FMLS, suffer from large bias and dispersion. Therefore, the fact that
filtering helps overcome this drawback is a particularly welcome result.

In  summary,  this  simulation  study  shows  that  filtering  to  remove  high  frequency 
components  significantly  improves  OLS  and  FMLS  cointegrating  estimators.  The  gains  from 
filtering  are  significant  in  small  samples. 
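
As a worked check of the gain definitions, the sketch below recomputes one cell of Table 3.3 from the corresponding entries of Table 3.1 (T=64, ρ=0.2, θ=0.3, σ=0.3). The numbers are taken from those tables; the function and variable names are illustrative.

```python
# Bias and RMSE from Table 3.1 (T=64, rho=0.2, theta=0.3, sigma=0.3).
unfiltered = {"OLS": (0.0439, 0.1023), "FMLS": (0.0147, 0.0837)}
filtered = {"OLS": (0.0095, 0.0752), "FMLS": (0.0025, 0.0777)}   # F-LS and F-FMLS

def filtering_gains(unf, fil):
    """Gain = unfiltered value minus filtered value, for bias and for RMSE."""
    return {
        name: (round(unf[name][0] - fil[name][0], 4),
               round(unf[name][1] - fil[name][1], 4))
        for name in unf
    }

gains = filtering_gains(unfiltered, filtered)
for name, (bias_gain, rmse_gain) in gains.items():
    bias_share = 100.0 * bias_gain / unfiltered[name][0]   # percent of bias removed
    rmse_share = 100.0 * rmse_gain / unfiltered[name][1]   # percent of RMSE removed
    print(name, bias_gain, rmse_gain, round(bias_share, 1), round(rmse_share, 1))
```

The computed gains agree with the first column of the T=64, ρ=0.2 block of Table 3.3.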

3.5.  Conclusion 


We have developed in this paper a new low-pass filter method for estimation and inference
in a cointegrated system. It has been shown that the Filtered LS estimator is more efficient than
the OLS estimator, and that the Filtered FMLS estimator inherits all the advantages of the FMLS
semiparametric treatment of regression errors and also inherits the desirable asymptotic properties
of FMLS. The filtering method, moreover, shows gains in both bias and RMSE reduction in
finite samples.

Note that the maximum likelihood method requires a complete specification of the data
generating mechanism for the errors. FMLS deals with this issue by employing a semiparametric
method, while the SR approach uses a nonparametric approach, in which a Fourier transformation
method makes it unnecessary to specify the short run dynamics which disturb the estimation of
the long run equilibrium. Our approach is based on data transformation. The major advantage of
this transformation is its noise reduction by the nonparametric treatment of regression errors. The
other advantage of this approach is that we may conduct the filtering process and regression in
the time domain while maintaining the intuition from the frequency domain. The method
suggested in this paper provides an alternative and convenient way to focus our attention on a
particular band of frequencies in estimating a cointegrated system.

In developing his spectral regression method for cointegrated systems, Phillips (1991a)
argued, "If the parameters of a model can be separated sensibly into the coefficients of long-run
relationships on the one hand and short-run dynamics on the other, then frequency domain
regressing provides a natural method for the efficient estimation of the long-run coefficients". This
paper extends this argument. In the circumstances Phillips stated, the filter method provides another
appealing method for the efficient estimation of the long-run coefficients in the time domain.
Furthermore, because a set of low-pass filter estimators (indexed by the smoothing parameter, λ)
share the same asymptotic distribution, the method provides an additional dimension when it
comes to adjusting the ability of the filter to remove the high frequency noise component. Some
adaptive filters may be more attractive, at least in finite samples. This is the subject of our future
research.


CHAPTER  4 

FILTERING  METHODOLOGY  AND  FIT  IN  DYNAMIC  BUSINESS  CYCLE  MODELS 

4.1  Introduction 

Two main research topics in macroeconomics and monetary economics are economic
growth and business cycles. For convenience, most economists try to separate those two parts of the data.
Although theoretically we can always define "growth" and "cycle" so that they are distinguishable,
as a practical matter it is difficult to distinguish them in an empirical study. Lucas (1977) defined
the business-cycle phenomena as the recurrent fluctuation of output about trend and co-movements
among other aggregate time series. Fluctuations are, by definition, deviations from some slowly
varying path. Since this slowly varying path increases monotonically over time, Prescott (1986)
labeled it the growth part of a changing economy. He claimed that this trend is neither a measure
nor an estimate of the unconditional mean of some stochastic process. In his view, it is best
defined by some computational procedure used to fit a smooth curve through the data.

Generally  two  methods  are  used  to  divide  data  into  a business  cycle  part  and  a growth 
part.  The  first  technique  simply  differences  the  data,  that  is,  it  uses  a difference  filter.  In  this  case, 
the  resulting  data  are  usually  stationary  and  these  data  are  taken  as  reflecting  all  the  statistical 
properties  of  the  business  cycle  component  of  the  data.  The  second  method  uses  a more 
sophisticated  filter  to  remove  the  low  frequency  (trend)  component  of  the  data.  The  most  popular 
filter used for this purpose is the HP filter. The HP filter was first introduced in Hodrick and
Prescott's (1980) seminal paper, which, though unpublished, is widely cited. In that paper, the
authors' original purpose was to decompose macro-time series into a growth (or trend) component
and a cyclical component. The basic idea is that without very much prior knowledge about the
time series characteristics of the data, the growth component should be the "smooth" part of the
data. According to this point of view, they used a moving average method to smooth the data,
with a parameter λ designed to be set by the researcher to "penalize" variability in the growth
component series. Essentially, λ specifies how smooth the growth component will be. After the
growth component is identified, the cyclical component is defined as the residual between the
actual data and the estimated growth component. While the paper provided a new method to
decompose nonstationary macro-time series data, the properties of this filter could not be clearly
seen.

The  first  use  of  the  HP  filter  was  by  Kydland  and  Prescott  (1982)  in  their  pioneering  RBC 
paper.  This  paper  has  led  to  a vast  and  burgeoning  literature.  Hansen  (1985),  for  instance, 
modifies  the  basic  Kydland-Prescott  model  by  simplifying  it  a bit  and  introducing  the  idea  of  an 
indivisible  labor  input.  Cooley  and  Hansen  (1989)  introduce  money  into  this  class  of  models  by 
utilizing  a cash-in-advance  constraint.  Christiano  and  Eichenbaum  (1992)  modify  the  prototypical 
model  by  introducing  government  expenditure.  Kim  and  Loungani  (1992)  incorporate  effects  from 
oil prices. Other related work includes Cho (1990), Cho and Cooley (1992), Bils and Cho (1993),
and Cho and Phaneuf (1992).⁴

Most  of  the  work  following  Kydland  and  Prescott  has  shared  one  common  factor:  both 
the  data  from  the  real  world  and  the  model-generated  data  are  filtered  through  an  HP  filter.  It 
certainly  makes  sense  to  filter  nonstationary  real-world  data  (like  output,  consumption  and 
investment)  because  their  standard  deviations  must  be  calculated.  Without  fiitenng,  due  to  the 


4 All  these  papers  concentrated  on  the  business  cycle  properties  of  the  model.  In  contrast.  King, 
Plosser.  and  Rebelo  (1988  I,  II)  examined  the  trend  properties.  Influenced  by  Engle  and  Granger  (1987), 
their  model  emphasized  the  point  that  output,  consumption,  effective  labor  input,  capital,  and  investment 
should  all  have  the  same  growth  rate;  that  is,  they  all  should  share  a common  trend. 


39 


nonstationary  nature  of  the  data,  these  statistics  would  be  a function  of  time,  so  it  would  be 
nonsensical  to  compare  them  with  anything.  But,  if  we  only  need  to  ensure  that  the  data  are 
stationary,  filtering  stationary  real-world  data  (like  working  hours)  and  model-generated  data 
seems  problematic.  Since  the  simulated  data  are,  by  construction,  nonlinear  functions  of  a white 
noise  and  sample  sizes  are  small,  it  is  very  difficult  to  test  their  stationarity.  Therefore,  without 
having enough information about the simulated data, filtering some U.S. data series and the
simulated  series  might  lead  to  an  "over-differencing"  problem,  which  we  discuss  in  the  next 
section. 

While it is not easy to understand this filtering methodology in the time domain, Prescott
(1986) offers an explanation based on the frequency domain. He suggests that attaining stationarity
is not the whole purpose of using the HP filter. Instead, by adjusting the parameter $\lambda$, the HP filter
can remove or reduce components of a time series below any specific frequency. This is a very
attractive view, because it is difficult to isolate the cyclical movement from the trend in the time
domain if the trend is not deterministic. For instance, Stock and Watson (1988) point out that if
we decompose a series $y_t$ into a trend part $y_t^T$ and a stationary part $y_t^S$, the correlation between
the trend and stationary innovations is purely a matter of a priori beliefs, and different time series
models can be estimated based on corresponding beliefs. The observational equivalence of these
different models, for instance ARIMA and UCARIMA, which reflect only different beliefs, suggests
a notable advantage of frequency analysis: in the frequency domain, we can use filters, such as
the HP filter suggested by Hodrick and Prescott, to unambiguously define and remove a trend by
removing or reducing power below some desired frequency. Prescott, for example, suggests that
the choice of $\lambda = 1600$ removes all the components in any time series which generate changes
whose frequencies are lower than $\pi/16$, that is, the cycles of the changes are longer than 8 years.
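
The link between a cutoff frequency and a cycle length can be made explicit. As a worked example, assuming quarterly data so that time is measured in quarters, a component with angular frequency $\omega$ has period $2\pi/\omega$:

```latex
\text{period} = \frac{2\pi}{\omega},
\qquad
\omega = \frac{\pi}{16}
\;\Rightarrow\;
\text{period} = \frac{2\pi}{\pi/16} = 32 \text{ quarters} = 8 \text{ years}.
```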

For business cycle research our interest is in the short-run dynamics of the economy;




therefore, the analysis should focus on the high frequency fluctuations of the important
macroeconomic time series. For nonstationary macroeconomic time series, what we need is to
extract the high frequency components from the nonstationary series rather than from their first
differenced processes, since the difference-detrending method would distort these short-run dynamics.
For example, the high frequency components of the GNP process are different from those of the
"technology shock", which is usually defined as the first difference of the GNP series. Here, a
major advantage of the HP filter is that it is free of any trend specification.

Although the HP filter has been widely applied in business cycle studies since its introduction,
there was very little literature investigating its theoretical and empirical properties until King and
Rebelo (1993) appeared. In that paper, King and Rebelo study the HP filter by deriving its gain function.
Their work is highly valuable, because it gives us important theoretical insights into the HP filter.
We are still left with some empirical issues, such as: (1) How efficiently does the HP filter
remove the low frequency components, as defined by Prescott (1986)?; (2) What is the impact of
the conventional method of filtering both the real data and artificial data?; and (3) How does the
efficiency change when the HP filter is used on different time series, e.g., stationary series,
nonstationary series without a deterministic trend, and nonstationary series with a deterministic
trend?

In  this  paper,  we  empirically  investigate  some  of  these  questions.  We  then  propose  a new 
procedure to carry out Prescott's (1986) belief that when filtering the data, the investigator should
be concerned about preserving certain frequencies or frequency ranges. With this new procedure,
we can check the fit of the RBC models with the real world data not only in the frequency band
suggested by Prescott ($> \pi/16$), but in any suggested frequency band.

In  section  II  we  analytically  describe  several  related  issues  of  filtering  methodology, 
briefly  review  basic  filtering  theory  in  the  time  and  frequency  domains,  and  then  turn  to  their 




applications  in  empirical  studies.  In  section  III,  we  design  a set  of  Monte  Carlo  experiments  to 
illustrate  a number  of  issues  which  affect  the  effectiveness  of  the  HP  filter.  In  section  IV,  a new 
method  is  introduced  to  measure  the  variances  and  covariances  contributed  by  only  high  frequency 
fluctuations  of  time  series.  The  last  section  gives  our  comments  on  the  results. 

4.2  Discussion  of  Current  Filtering  Methodology  in  Real  Business  Cycle  Model 

4.2.1 The HP Filter And Related Theory
The  basic  HP  objective  function  is 

$$\min_{\{y_t^g\}} \sum_{t=1}^{T} \left\{ (y_t - y_t^g)^2 + \lambda \left[ (y_{t+1}^g - y_t^g) - (y_t^g - y_{t-1}^g) \right]^2 \right\} \qquad (4.1)$$

where $y_t$ is the data series and $y_t^g$ is the growth component to be estimated. From the first order
condition, we get $F(B) y_t^g = y_t$, where $B$ is the backward operator and
$F(B) = [\lambda (1-B)^2 (1-B^{-1})^2 + 1]$.5 To analyze the properties of the HP filter, King and Rebelo (1993)
derive the inverse of $F(B)$, $F^{-1}(B)$,6 which is the HP trend filter. From this, they get the analytic
form of the HP cyclical filter $(1 - F^{-1}(B))$:

$$C(B) = \frac{\lambda (1-B)^2 (1-B^{-1})^2}{1 + \lambda (1-B)^2 (1-B^{-1})^2} \qquad (4.2)$$

Rewrite the above representation as:


5 For  the  details  of  how  this  result  is  derived,  see  King  and  Rebelo  (1993). 

6 For  the  details  of  why  F(B)  is  invertible  see  the  technical  appendix  of  King  and  Rebelo  (1993). 




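$$C(B) = \frac{\theta_1 \theta_2 (1-B)^2 (1-B^{-1})^2}{(1-\theta_1 B)(1-\theta_2 B)(1-\theta_1 B^{-1})(1-\theta_2 B^{-1})}
= \theta_1 \theta_2 \, \frac{1-B}{1-\theta_1 B} \cdot \frac{1-B}{1-\theta_2 B} \cdot \frac{1-B^{-1}}{1-\theta_1 B^{-1}} \cdot \frac{1-B^{-1}}{1-\theta_2 B^{-1}} \qquad (4.3)$$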

where $\theta_1$ and $\theta_2$ are the complex roots of F(B). From (4.2) and (4.3), it is clear that the HP filter
differences the data, but it is hard to see how the filter changes the data. Therefore, using the HP
filter to difference time series may create an "over-differencing" problem. The over-differencing
problem is that applying a difference operator to a stationary series will dramatically change the
statistical properties of the original series, though the output series will still be stationary.7
Concretely, an ARIMA(p,d,q) series can be differenced d times to get a stationary ARMA(p,q)
series. If it is differenced more than d times, the result is not an ARMA(p,q) stationary series but
some other series. For instance, an ARMA(p,q) process $y_t$ can be expressed as


$$y_t = \frac{\theta(B)}{\phi(B)} \varepsilon_t \qquad (4.4)$$


where $\phi(B)$ and $\theta(B)$ are polynomials of order p and q, respectively, and $\varepsilon_t$ is white noise. The
stationarity condition for this ARMA series is that all roots of $\phi(B)$ lie outside the unit circle. If this
series is differenced n times, the new series will be


7 Another issue related to over-differencing is the so-called nonfundamental problem. Following
Lippi and Reichlin's (1993) definition of a nonfundamental series, over-differencing will cause a time series
to have a pseudo-moving average component which is noninvertible. Therefore, the white noise component
of the series cannot be expressed as a linear function of current and past realizations of the
observable random variables. The consequence is that it is intractable to detect the statistical properties of
the white noise (like its variance) based on currently available information in conventional time series theory.
In this case, studying and estimating this time series will be very difficult.




$$\tilde{y}_t = (1-B)^n y_t = \frac{(1-B)^n \theta(B)}{\phi(B)} \varepsilon_t \qquad (4.5)$$


This new series is still stationary, but $(1-B)^n \theta(B)$ has n unit roots, so it is noninvertible. Since a
noninvertible series cannot be written as a pure AR series, conventional estimation methods
will give biased results.8 If we know a time series is I(d), we will not difference it more than d
times. But when we do not know for sure the order of integration of the data, nor the order of
differencing to which the HP filter corresponds, over-differencing remains a potential pitfall.
Studying the HP filter in the time domain is not easy, so we turn to the frequency domain.
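
The over-differencing effect described above can be illustrated numerically. The following is a minimal sketch, not part of the original analysis: it assumes a white noise input (the simplest stationary case), a fixed seed chosen only for reproducibility, and a hypothetical helper `lag1_autocorr`. Differencing this input once gives $(1-B)\varepsilon_t$, an MA(1) with a unit root, whose lag-1 autocorrelation is $-1/2$ and whose variance is twice that of the input.

```python
import numpy as np

def lag1_autocorr(x):
    """Sample lag-1 autocorrelation of a 1-D array."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    return float(np.dot(x[1:], x[:-1]) / np.dot(x, x))

rng = np.random.default_rng(0)           # fixed seed, for reproducibility only
eps = rng.normal(0.0, 1.0, size=20000)   # stationary input: white noise, I(0)
over_diff = np.diff(eps)                 # (1 - B) eps_t: noninvertible MA(1)

rho_before = lag1_autocorr(eps)          # close to 0 for white noise
rho_after = lag1_autocorr(over_diff)     # theory: -1/2 after differencing
var_ratio = np.var(over_diff) / np.var(eps)  # theory: 2
```

The output series is still stationary, but its second moments no longer resemble those of the input, which is the sense in which the statistical properties are "dramatically changed".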

8 Choi (1990) and Nelson and Plosser (1982) indicate that if a time series has a nontrivial moving
average part, autoregression will generate upward biased estimates. Once a time series has been
over-differenced, one might not be able to reject the null hypothesis that the series is nonstationary by
unit-root tests; therefore, more difference operations may be conducted.

In the frequency domain, a time series X(t) which is absolutely integrable in $(-\infty, \infty)$, that is:

$$\int_{-\infty}^{\infty} |X(t)| \, dt < \infty \qquad (4.6)$$

has a Fourier transformation, $Z_x(\omega)$, that converges in $t \in (-\infty, \infty)$, where $Z_x(\omega)$ is defined as:

$$Z_x(\omega) = \int_{-\infty}^{\infty} X(t) e^{-i\omega t} \, dt \qquad (4.7)$$

and hence we can get the spectrum distribution function $\Gamma_x(\omega) = \| Z_x(\omega) \|^2$. When X(t) is a
stationary stochastic process, its autocovariance function can be represented as:

$$\gamma(\tau) = \int_{-\pi}^{\pi} \Gamma_x(\omega) e^{i\omega\tau} \, d\omega \qquad (4.8)$$

and  its  cross-covariance  generating  function  with  Y(t)  (also  stationary)  can  be  denoted  as:9 

$$\gamma_{xy}(\tau) = \int_{-\pi}^{\pi} E\{ Z_x(\omega) Z_y(\omega) \} e^{i\omega\tau} \, d\omega \qquad (4.9)$$

In many cases, a time series is filtered in the time domain by some filter h(B), and if the filter
satisfies the following conditions:

$$h(B) = \sum_{j=-\infty}^{\infty} h_j B^j \qquad (4.10)$$

and

$$\sum_{j=-\infty}^{\infty} |h_j| < \infty \qquad (4.11)$$

we have the representation of this filtering relationship between input series X(t) and output series
Y(t) in the frequency domain as:

$$T(\omega) = G(\omega) \overline{G(\omega)} \qquad (4.12)$$

$$\Gamma_y(\omega) = T(\omega) \Gamma_x(\omega) \qquad (4.13)$$

where $\Gamma_y(\omega)$ is the spectrum distribution function of output series Y(t) and $T(\omega)$ is the transfer
function of the filter h(B). $T(\omega)$ is the squared norm of the gain function $G(\omega)$ of the filter, calculated
from:

$$G(\omega) = \sum_{j=-\infty}^{\infty} h_j e^{-i\omega j} \qquad (4.14)$$

9 The proof of (4.9) is found in Jenkins and Watts (1968), pp. 344-345.

Theoretically, the convergence of the Fourier transformation for a nonstationary time series is not
guaranteed in $(-\infty, \infty)$, since (4.6) may not be satisfied for those series. But empirically, we can
do a Fourier transformation for any time series, since in a finite $(-T, T)$ time span the sample
counterpart of (4.6) is always finite. Therefore the above results can be borrowed if we change
the spectrum distribution functions $\Gamma_x(\omega)$ and $\Gamma_y(\omega)$ to their sample versions
$\hat{\Gamma}_x(\omega)$ and $\hat{\Gamma}_y(\omega)$.

Specializing now to the case of an HP filter, King and Rebelo derive the gain function
for the HP filter:10

$$G_{HP}(\omega) = \frac{4\lambda [1 - \cos(\omega)]^2}{1 + 4\lambda [1 - \cos(\omega)]^2} \qquad (4.15)$$

It is well known that the sample spectrum of an integrated process of order d, $\hat{\Gamma}_x^d(\omega)$, has the
approximate shape $\hat{\Gamma}_x^d(\omega) \sim A\omega^{-2d}$ when $\omega$ approaches zero and the sample size is large
enough, while the sample spectrum of a stationary series, $\hat{\Gamma}_x^0(\omega)$, has the property
$\hat{\Gamma}_x^0(\omega) < \infty$. Let the spectrum of the input series be $\hat{\Gamma}_x(\omega) = \hat{\Gamma}_x^d(\omega)$. It is easy to see that the
spectrum distribution function of the output series from the HP filter will be:

$$\hat{\Gamma}_y(\omega) = G_{HP}^2(\omega) \hat{\Gamma}_x(\omega) \sim \frac{A \lambda^2 \omega^{8-2d}}{(1 + \lambda\omega^4)^2} \qquad (4.16)$$

Clearly, for $\omega$ very close to zero, as long as $d \le 4$, $\hat{\Gamma}_y(\omega) < \infty$. So, as King and Rebelo
demonstrate, the HP filter renders stationary data which are integrated up to the fourth order.

10 It is easy to see that the HP filter satisfies (4.10) and (4.11) from King and Rebelo's proof that F(B)
is invertible.
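
The bound on the order of integration can be checked with a small-$\omega$ expansion of (4.15). The following is a sketch only; it assumes that the input spectrum behaves like $A\omega^{-2d}$ near zero and that the transfer function is the squared gain:

```latex
1 - \cos\omega \approx \frac{\omega^2}{2}
\;\Rightarrow\;
G_{HP}(\omega) \approx \frac{\lambda\omega^4}{1 + \lambda\omega^4},
\qquad
G_{HP}^2(\omega)\, A\omega^{-2d} \approx \frac{A\lambda^2\,\omega^{8-2d}}{(1 + \lambda\omega^4)^2}.
```

Under these assumptions the right-hand side stays bounded as $\omega \to 0$ exactly when $8 - 2d \ge 0$, that is, when $d \le 4$.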

Now we turn to study the effect of the HP filter on high frequency components. We
define these as frequencies higher than $\pi/16$, so that their corresponding cycle lengths are shorter
than 8 years. From (4.15) we see that when $\omega$ approaches $\pi/2$, the gain function of the HP filter
is close to $4\lambda/(1+4\lambda)$. If $\lambda$ is large, the value of the transfer function of the HP filter will be very
close to 1. Therefore the high frequency components of the time series on which the filter is used
are left essentially untouched.
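
The gain function (4.15) is easy to evaluate directly. The sketch below is illustrative only; the function name `hp_gain` and the choice of evaluation points are ours, and $\lambda = 1600$ is used simply because it is the conventional quarterly value.

```python
import numpy as np

def hp_gain(omega, lam):
    """Gain of the HP cyclical filter, equation (4.15)."""
    x = 4.0 * lam * (1.0 - np.cos(omega)) ** 2
    return x / (1.0 + x)

lam = 1600.0
gain_zero = hp_gain(0.0, lam)            # zero frequency is removed entirely
gain_cutoff = hp_gain(np.pi / 16, lam)   # the 8-year cutoff frequency
gain_half_pi = hp_gain(np.pi / 2, lam)   # equals 4*lam / (1 + 4*lam)
gain_pi = hp_gain(np.pi, lam)            # equals 16*lam / (1 + 16*lam)
```

Because the gain is monotonically increasing on $(0, \pi)$, the value at $\pi/2$ is a lower bound for the gain over the upper half of the frequency range.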

Figure 4.1 draws the transfer function of the HP filter and of the difference filter (1-B), which
is also widely used in time series and macroeconomic studies.12 Figure 4.1 shows that the
transfer functions of the HP filter with different values of $\lambda$ and of the first order difference filter are
monotonically increasing in $(0, \pi)$, but the former rises much faster in the early stages than in the
later stages. Once the value of the HP filter transfer function approaches one, it becomes very
stable. This property is extremely useful in preserving high frequency components. The figure also
shows that with decreasing $\lambda$, the HP filter transfer function shifts to the right. This indicates that
the ability of the HP filter to retain high frequency components varies with the value of $\lambda$. Figure 4.1

11 See Engle and Granger (1987).

12 The transfer function of the difference filter is $T_D(\omega) = 2[1 - \cos(\omega)]$.



also  implies  a fact  worth  noting,  that  the  sample  periodogram  of  a difference-stationary  process 
is  different  from  the  frequency  distribution  of  its  first  differenced  process.13 

Table 4.1 lists some critical values of the HP filter's transfer function. When $\lambda$ is higher
than 10,000, over 90% of the power of the input time series in the frequencies between $\pi/16$ and
$\pi/8$ will be preserved; if $\lambda = 1600$, on average only 65% of the power of the frequencies between
$\pi/16$ and $\pi/12$ will be retained; and taking $\lambda = 400$ cuts even more high frequency components.
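
Band averages of this kind can be approximated by averaging the squared gain from (4.15) over a grid of frequencies. The sketch below is an approximation under our own assumptions (an evenly spaced grid, and the hypothetical helper names `hp_transfer` and `mean_transfer`); it is not the procedure used to build Table 4.1.

```python
import numpy as np

def hp_transfer(omega, lam):
    """Transfer function of the HP cyclical filter: squared gain from (4.15)."""
    x = 4.0 * lam * (1.0 - np.cos(omega)) ** 2
    return (x / (1.0 + x)) ** 2

def mean_transfer(lo, hi, lam, n=2001):
    """Average of the transfer function over the band [lo, hi] on an even grid."""
    grid = np.linspace(lo, hi, n)
    return float(np.mean(hp_transfer(grid, lam)))

share_10000 = mean_transfer(np.pi / 16, np.pi / 8, 10000.0)   # above 0.9
share_1600 = mean_transfer(np.pi / 16, np.pi / 12, 1600.0)    # roughly 0.65
share_400 = mean_transfer(np.pi / 16, np.pi / 12, 400.0)      # smaller still
```

The same helper can be pointed at any other band or value of $\lambda$ to see how much of the input power a given HP filter would retain there.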

In  contrast,  the  first  order  difference  filter  changes  the  power  of  almost  every  frequency 
component  of  the  time  series  it  is  working  on.  Indeed,  after  being  differenced,  some  high 
frequency components of the data are enlarged up to four times their original level.

Because  the  HP  filter  can  make  stationary  almost  any  data  series  encountered  in  applied 
work, it is widely used in real business cycle models. In practice, the HP filter is applied to filter out the
low  frequency  components  (while  retaining  the  high  frequency  components)  for  both  the  actual 
and  the  simulated  economic  time  series  before  calculating  variances  and  covariances.  We  can  see 
two  possible  reasons  for  filtering  the  simulated  data:  (1)  The  data  may  be  nonstationary;  and  (2) 
even  if  stationary,  the  data  may  contain  considerable  low  frequency  components  which  may 
constitute  a large  part  of  total  variance.  So,  when  trying  to  look  at  business  cycle  frequencies  it 
may  not  be  appropriate  to  be  content  with  a stationary  series  if  it  contains  a large  measure  of  low 
frequency  components. 

While  HP  filtering  stationary  series  is  legitimate,  as  a practical  matter  one  may  imagine 
that  using  the  same  filter  on  data  that  have  dramatically  different  time  series  characteristics  is 
questionable.  From  (4.13),  we  see  that  the  spectrum  of  output  series  not  only  depends  on  the 

13 Similar comments about the first-difference filter have been made by Baxter (1994).




filter's transfer function, but also on the spectrum of the input series. If we have some
requirements for the spectrum of the output series, such as little or no power for the frequencies
below a certain frequency, the filter should be designed for the specific input series. Thus there
is no unique optimal filter which will perform well for all input series. Hence the question
becomes whether the RBC models can generate a data set that is close enough to the real world
data to allow us to use the same HP filter (here by same filter we mean the same value of $\lambda$) to
get empirically satisfactory results. In the Kydland-Prescott style RBC models, $\lambda$ is commonly
assigned a value of 1600 for quarterly data. Since picking the value of $\lambda = 1600$ is arbitrary, the
robustness of calculated standard deviations and correlations is also open to question.
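
For readers who wish to experiment with different values of $\lambda$, the minimization in (4.1) can be sketched in finite-sample matrix form. This is an illustrative implementation under our own assumptions, not the code used for the results in this chapter: the function name `hp_filter` is hypothetical, and the end points are handled by simply dropping the second-difference terms that fall outside the sample, which differs from the two-sided infinite-sample filter F(B).

```python
import numpy as np

def hp_filter(y, lam=1600.0):
    """Return (trend, cycle) from minimizing the HP objective (4.1).

    The first order conditions form the linear system
    (I + lam * K'K) g = y, where K is the (T-2) x T second-difference matrix.
    """
    y = np.asarray(y, dtype=float)
    T = y.size
    K = np.zeros((T - 2, T))
    for t in range(T - 2):
        K[t, t:t + 3] = (1.0, -2.0, 1.0)   # g[t] - 2*g[t+1] + g[t+2]
    trend = np.linalg.solve(np.eye(T) + lam * (K.T @ K), y)
    return trend, y - trend

# Usage: a pure linear trend has zero second differences, so the HP trend
# reproduces it and the cyclical component vanishes up to rounding error.
t = np.arange(128)
trend, cycle = hp_filter(0.008 * t, lam=1600.0)
```

A dense solve is adequate for samples of a few hundred observations; a banded or sparse solver would be the natural choice for much longer series.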

4.2.2  Results  From  An  RBC  Example  And  Actual  Data 

Here, we take the indivisible labor input real business cycle model of Hansen (1985) as
a benchmark to check the effect of HP filtering. We briefly describe Hansen's model in the
appendix. To be comparable, we use the same calibration as Hansen (1985),14 which is widely
applied in the RBC literature.

First, we check the effect of HP filtering U.S. data graphically. In Figures 4.2.1 to 4.2.4, we
show the frequency power distributions of U.S. output, consumption, investment and employment
data before and after HP filtering. Figures 4.2.1(a) to 4.2.4(a) provide the power distribution of all
frequencies. We see that for output, consumption and investment, when their frequencies are very
close to zero, their power tends to grow without limit. This indicates that those time series are

14 Because our data, which we obtained from Mark Watson, extend over 128 quarters rather than the 115
initially used by Hansen, we had to enlarge slightly the variance of the technology shock (from 0.00712 to
0.0089) to match the sample standard deviation of output.




nonstationary. Employment, however, is different. Its spectrum declines as the frequency goes
to zero, which is indicative of stationarity.

We  see  in  the  figures  that  the  HP  filter  effectively  cuts  the  low  frequency  components. 
But,  the  scale  of  the  figures  makes  it  hard  to  see  the  filter’s  effect  on  the  higher  frequency 
components.  Hence,  Figure  4.2.1(b)  to  4.2.4(b)  show  the  high  frequency  part  of  the  spectra.  From 
here  it  appears  that  the  HP  filter  not  only  reduces  low  frequency  components  but  also  cuts  the 
high  frequency  components  from  the  nonstationary  series,  like  output,  consumption  and 
investment.  This  is  not  expected  from  our  theoretical  analysis.  For  employment,  which  is 
apparently "more" stationary, as expected the HP filter has no impact on high frequencies.

Two  questions  arise  from  these  results:  (1)  Are  the  spectra  we  estimated  from  a sample 
of  a nonstationary  series  close  enough  to  the  real  spectra?  (2)  Is  the  relationship  represented  by 
equation  (4.13)  still  true  when  the  input  spectrum  is  from  a nonstationary  series?  We  shall  return 
to  these  points  in  detail  in  part  III. 

Figures 4.3.1 to 4.3.4 show the power spectra for simulated data15 before and after HP
filtering.16 Figures 4.3.1(a) to 4.3.4(a) present spectra for all frequencies and Figures 4.3.1(b) to
4.3.4(b) show the high frequency part of the spectra. We see that the model-simulated data have
the characteristic of a nonstationary series: the power does not tend to a finite number as the
frequency approaches zero. For all the simulated data, however, contrary to the U.S. data, the HP
filter preserves the high frequency components quite well.

Figure 4.2 and Figure 4.3 provide overall pictures about how the HP filter works for the

15 Like Hansen, we simulate the model 100 times. In every simulation, we transform the simulated data
to the spectrum. The final results are the average of the 100 spectra.

16 To be comparable, we choose $\lambda = 1600$, the value widely used by the RBC literature for quarterly data.




different U.S. data and simulated data. But to see exactly how much power has been eliminated
from those time series, the graphs are not precise enough. Table 4.2 lists the second moments of
the U.S. data and of the simulated data before and after HP filtering.

The data are provided by Mark W. Watson, and cover the period 57.1-88.4. "Std"
refers to the standard deviation of the variable; "Corr" to the correlation between the variable and
output. In parentheses are the standard deviations of the second moments from the simulated data.
It is obvious that the standard deviations and correlations from both the U.S. data and simulated
data are changed dramatically by the HP filter. Those changes themselves cannot be interpreted
as evidence that the HP filter distorts the statistical properties of the data. What we care about is
whether the changes in the second moments arise solely because the HP filter reduces the power
of the low frequency components of the series, or because of reductions in the power of the high
frequency parts. If it is the former effect that is operating, then applying the HP filter to U.S. and
simulated data is acceptable, and from Table 4.2 we may conclude that the U.S. data have more
powerful low frequency parts than the simulated data. But if it is the latter effect, then the
HP filter is introducing undesirable distortions.

We see from the work in this section that three points are clear: (1) From the transfer
function, the HP filter preserves high frequency components (cycles shorter than 4 years) very
well when it is used to eliminate the low frequency components, while the difference filter
changes the power distribution for every frequency; (2) The HP filter reduces the power of low
frequency components of both U.S. data and model simulated data; (3) After HP filtering, the
standard deviations and correlations change by relatively large amounts and are usually smaller than
before.


Figure 4.2.1(a) Spectra of U.S. Output Series, All Frequencies
Figure 4.2.2(a) Spectra of U.S. Consumption Series, All Frequencies
Figure 4.2.4(a) Spectra of U.S. Employment Series, All Frequencies

Figure 4.2.1(b) Spectra of U.S. Output Series, High Frequencies
Figure 4.2.2(b) Spectra of U.S. Consumption Series, High Frequencies
Figure 4.2.3(b) Spectra of U.S. Investment Series, High Frequencies
Figure 4.2.4(b) Spectra of U.S. Employment Series, High Frequencies

Figure 4.3.1(a) Spectra of Simulated Output, All Frequencies
Figure 4.3.2(a) Spectra of Simulated Consumption, All Frequencies
Figure 4.3.3(a) Spectra of Simulated Investment, All Frequencies
Figure 4.3.4(a) Spectra of Simulated Employment, All Frequencies

(Horizontal axis: $\omega/2\pi$, frequency in cycles per quarter. Legend: without filter; with filter $\lambda = 1600$.)



There are, however, questions and problems remaining. First, Figure 4.2 shows that the
HP filter not only cuts low frequency components but also reduces some high frequency
components from U.S. output, consumption and investment series. Second, since the value of $\lambda$
is picked relatively arbitrarily, we would like to know how robust those standard deviations and
correlations are for different values of $\lambda$.17 The next section explores these issues.


4.3 Monte Carlo Experiments


In the previous section we have shown the performance of the HP filter on U.S. data and
simulated data. In order to investigate why the HP filter apparently had such a large effect on
some of the calculated statistics, a Monte Carlo experiment using stationary and nonstationary
processes is instructive. The models chosen are an AR(1) process, a random walk, and a UCARIMA
model. The AR(1) model is

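$$(1 - \theta_1 B) y_t = \varepsilon_t$$

where $\theta_1 = 0.7$ and $\varepsilon_t \sim \text{iid } N(0, 0.021^2)$.

The random walk process is in the form of

$$(1 - B) y_t = \mu + \varepsilon_t$$

where $\mu = 0$ and $\varepsilon_t \sim \text{iid } N(0, 0.05^2)$.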

The  third  model  is  the  UCARIMA(2,1,0)  model  employed  by  Watson  (see  Stock  and 


17 There is a potential third issue: Using the same filter on different time series may leave different
levels of low frequency power for different filtered series. An obvious question then is, as an empirical
issue, how severe is this problem? Our results in the next section cast some light on this issue, but we do
not examine it at length.




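Watson (1988)) to fit U.S. output data from 1949:I to 1984:IV. This features both a deterministic
trend, $\mu$, and a stochastic trend, plus a stationary AR(2) component. The innovations of the
stochastic trend and stationary component are uncorrelated by assumption. The model is

$$y_t = y_t^p + y_t^s,$$
$$y_t^p = \mu + y_{t-1}^p + \varepsilon_t^p,$$
$$y_t^s = \theta_1 y_{t-1}^s + \theta_2 y_{t-2}^s + \varepsilon_t^s,$$

where $\mu = 0.008$, $\theta_1 = 1.5$, $\theta_2 = -0.6$ and

$$\begin{pmatrix} \varepsilon_t^p \\ \varepsilon_t^s \end{pmatrix} \sim \text{iid } N\left( 0, \begin{pmatrix} 0.0057^2 & 0 \\ 0 & 0.0076^2 \end{pmatrix} \right)$$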
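
The three data generating processes can be simulated as follows. This is a minimal sketch rather than the original simulation code: the function names, the burn-in length, the zero starting values, and the seed are our assumptions, and the innovation scales are read as standard deviations (0.021, 0.05, 0.0057 and 0.0076).

```python
import numpy as np

def simulate_ar1(T, theta=0.7, sigma=0.021, rng=None, burn=200):
    """(1 - theta*B) y_t = eps_t, with a burn-in so the start-up value is forgotten."""
    rng = np.random.default_rng() if rng is None else rng
    eps = rng.normal(0.0, sigma, T + burn)
    y = np.zeros(T + burn)
    for t in range(1, T + burn):
        y[t] = theta * y[t - 1] + eps[t]
    return y[burn:]

def simulate_random_walk(T, mu=0.0, sigma=0.05, rng=None):
    """(1 - B) y_t = mu + eps_t, started from y_0 = 0."""
    rng = np.random.default_rng() if rng is None else rng
    return np.cumsum(mu + rng.normal(0.0, sigma, T))

def simulate_ucarima(T, mu=0.008, theta1=1.5, theta2=-0.6,
                     sigma_p=0.0057, sigma_s=0.0076, rng=None, burn=200):
    """Random walk with drift plus an independent stationary AR(2) component."""
    rng = np.random.default_rng() if rng is None else rng
    trend = np.cumsum(mu + rng.normal(0.0, sigma_p, T))
    eps_s = rng.normal(0.0, sigma_s, T + burn)
    s = np.zeros(T + burn)
    for t in range(2, T + burn):
        s[t] = theta1 * s[t - 1] + theta2 * s[t - 2] + eps_s[t]
    return trend + s[burn:]

rng = np.random.default_rng(0)   # fixed seed, for reproducibility only
samples = {
    "AR(1)": simulate_ar1(128, rng=rng),
    "random walk": simulate_random_walk(128, rng=rng),
    "UCARIMA(2,1,0)": simulate_ucarima(128, rng=rng),
}
```

The two trend and cycle innovations in the UCARIMA model are drawn independently, which matches the assumption that they are uncorrelated.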


Figure 4.4 depicts several sets of spectra of filtered and unfiltered series generated from
these three processes. We derive the theoretical non-zero-frequency spectra for those time series
in appendix B. For the purpose of having sample sizes close to that of the postwar U.S. quarterly
data, we set the sample size at 128. To show the HP filter's performance, we set $\lambda$ equal to
10,000, 1600, 400 and 100. The spectra were estimated by averaging 100 sample periodograms.
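
Averaging sample periodograms can be sketched as follows for the AR(1) case. The normalization ($|\text{DFT}|^2/T$ with no $2\pi$ factor), the helper names, the burn-in, and the seed are our assumptions; the theoretical spectrum is written in the same normalization so that the two are directly comparable.

```python
import numpy as np

def periodogram(y):
    """Periodogram |DFT|^2 / T at the Fourier frequencies 2*pi*k/T, k = 1..T//2."""
    y = np.asarray(y, dtype=float)
    T = y.size
    dft = np.fft.rfft(y)
    omega = 2.0 * np.pi * np.arange(dft.size) / T
    return omega[1:], (np.abs(dft) ** 2 / T)[1:]   # zero frequency skipped

def ar1_path(T, theta, sigma, rng, burn=200):
    """One AR(1) sample path of length T after discarding a burn-in."""
    eps = rng.normal(0.0, sigma, T + burn)
    y = np.zeros(T + burn)
    for t in range(1, T + burn):
        y[t] = theta * y[t - 1] + eps[t]
    return y[burn:]

def ar1_spectrum(omega, theta, sigma):
    """Theoretical AR(1) spectrum in the same normalization as periodogram()."""
    return sigma ** 2 / (1.0 - 2.0 * theta * np.cos(omega) + theta ** 2)

rng = np.random.default_rng(0)
T, reps, theta, sigma = 128, 100, 0.7, 0.021
omega, _ = periodogram(np.zeros(T))
avg = np.mean([periodogram(ar1_path(T, theta, sigma, rng))[1]
               for _ in range(reps)], axis=0)
ratio = avg / ar1_spectrum(omega, theta, sigma)    # close to 1 for a stationary series
```

For a stationary process such as this one, the averaged periodogram should track the theoretical spectrum closely, so the ratio stays near one across frequencies.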

Figure 4.4(a) clearly shows that for the AR(1) process the estimated spectrum nicely fits
the theoretical spectrum. For this AR(1) model, there exists a moderate amount of low frequency
components (less than or equal to 0.0625) that may be considered undesirable for the purpose of
business cycle analysis. Therefore, it may be reasonable to detrend the stationary time series
before computing any statistics. The figure shows that the HP filter significantly reduces the power
of the low frequencies without hurting the (desirable) high frequency components very much.
Also, the power of the low frequency components monotonically decreases as the smoothing
parameter decreases. Apparently, an HP filter with $\lambda = 1600$ does not necessarily eliminate all low


Figure 4.4(b) Power Spectra of a Random Walk Process

Figure 4.4(c) Power Spectra of a UCARIMA(2,1,0) Process



frequencies. The spectrum of the filtered series with $\lambda = 10,000$ is closer to the theoretical
spectrum than that of the filtered series with $\lambda = 1600$ for the frequencies from $\pi/16$ to $\pi/8$, while
almost the same for the other high frequencies.

Figure 4.4(b) shows the results from the random walk process. The following points
should be clear. First, the estimated spectrum of the unfiltered series is higher than the theoretical
spectrum over $(0, \pi]$.18 This reveals that there is substantial power leakage
when directly estimating a spectrum for a nonstationary process. Due to the nature of the discrete
methodology used to estimate the sample spectrum, if the spectrum has a large peak at some
frequency, then it is possible that some of the power of this frequency will "leak" into the estimate
of the spectrum at other frequencies.19 The presence of the leakage implies that inferences based
on the spectrum directly transformed from a nonstationary time series are questionable. Second,
the spectrum from the HP filtered series shows a close fit to the theoretical spectrum for all high
frequencies greater than $0.2\pi$. The HP filter also significantly reduces the power of the low frequency
part. And, as in Figure 4.4(a), the smaller the smoothing parameter, the stronger the filtering
ability. Third, the first difference filter amplifies the power for high frequencies, while reducing
the spectral power for low frequencies. The resulting spectrum diverges from the theoretical
spectrum. This implies that the first difference filter may not be an appropriate device to achieve
stationarity for the purpose of spectral analysis.
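
The upward bias from leakage can be reproduced in a small experiment. The sketch below is illustrative and rests on our own assumptions: a driftless random walk started at zero, the periodogram normalization $|\text{DFT}|^2/T$, a pseudo-spectrum of $\sigma^2 / (2[1 - \cos\omega])$ away from zero frequency, and 500 replications (more than the 100 used in the text, to reduce sampling noise).

```python
import numpy as np

rng = np.random.default_rng(0)   # fixed seed, for reproducibility only
T, reps, sigma = 128, 500, 0.05
k = np.arange(1, T // 2 + 1)
omega = 2.0 * np.pi * k / T

# Pseudo-spectrum of a random walk away from zero frequency:
# sigma^2 / |1 - exp(-i*omega)|^2 = sigma^2 / (2 * (1 - cos(omega))).
theory = sigma ** 2 / (2.0 * (1.0 - np.cos(omega)))

avg = np.zeros(k.size)
for _ in range(reps):
    y = np.cumsum(rng.normal(0.0, sigma, T))      # random walk, no drift
    avg += np.abs(np.fft.rfft(y)[1:]) ** 2 / T    # periodogram, zero frequency skipped
avg /= reps

ratio = avg / theory   # values above 1 indicate upward bias from leakage
```

In this particular setup the averaged periodogram lies above the pseudo-spectrum at every Fourier frequency, with a ratio of roughly two, which is in line with the leakage visible for the unfiltered series.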

The  power  spectra  of  the  filtered  and  unfiltered  series  from  the  UCARIMA(2,1,0)  process 
are  presented  in  figure  4.4(c).  This  graph  illustrates  some  useful  points.  First,  adding  a 

18 To keep the scale of the comparison between the spectra of the empirical sample data and
the theoretical spectra reasonable, we skip the zero frequency.

19 For the detailed discussion of this issue, see Jenkins and Watts (1968).




deterministic trend ($\mu = 0.008$) drastically shifts the estimated spectrum upward. This suggests that
the addition of a linear trend strongly increases the leakage problem.20 Second, the severe leakage
problem can be overcome by filtering to achieve stationarity. The spectra of the HP filtered series
fit the theoretical spectrum for high frequencies (greater than $0.2\pi$) but at the cost of reducing
the power of the spectra at low frequencies (less than $0.2\pi$).21 Third, the first difference filter
cuts too much power for frequencies less than $0.35\pi$ and exaggerates the power for the frequencies
larger than $0.35\pi$.

Table 4.3 summarizes the results from our Monte Carlo experiments. Examining the table,
the following points are clear. First, it is reassuring to note that the standard deviations of the
unfiltered series calculated from the frequency domain closely match those computed from the
time domain for all three models. However, when we compute the standard deviation in the
frequency domain excluding zero frequency (which is perhaps appropriate for nonstationary
processes, since theoretically the power at zero frequency for nonstationary processes is
infinite), we see that the standard deviations for the random walk and UCARIMA processes are
both substantially lower than the sample standard deviations. This strongly indicates that the
sample standard deviations of the nonstationary processes do not represent very well the actual
variation attributed to cycles with finite cycle lengths. Second, the HP filter reduces the standard
deviations, and the larger $\lambda$ is, the smaller the reduction in variation. Combining this with Figure 4.4,

20 A simulation study by Granger (1964) shows that the size of the linear trend parameter is the key factor
which affects the leakage. The upward bias of the estimated spectrum is more severe if a stronger
deterministic trend is included.

21 It is worthwhile to note that figure 4(b) shows that the spectrum of the HP filtered series with
λ = 10,000 is quite close to the theoretical spectrum for the frequencies from 0.0625π to 0.2π and also has
almost identical power for the frequencies greater than 0.2π. This implies that an HP filter with λ equal to
or larger than 10,000 may be appropriate to serve as a filter to achieve stationarity.




we conclude that the decreasing size of the standard deviations in all three models is mainly
attributable to the reductions in the power of the frequencies less than 0.2π. Third, the standard
deviations computed from the first differenced series are smaller than those computed from the
HP filtered series except for λ = 100. This is the result of the fact that the first difference filter cuts too
much power for frequencies less than 0.3π while it exaggerates the power for frequencies greater
than 0.3π.
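
The link between standard deviations computed in the time domain and in the frequency domain rests on Parseval's identity for the discrete Fourier transform. As a reminder, written for a demeaned sample x_0, ..., x_{T-1} (the symbols x_t, X_j, ω_j and the index set B are notation introduced here for illustration, not taken from the text):

```latex
X_j = \sum_{t=0}^{T-1} x_t\, e^{-i\omega_j t}, \qquad \omega_j = \frac{2\pi j}{T}, \qquad
\frac{1}{T}\sum_{t=0}^{T-1} x_t^{2} \;=\; \frac{1}{T^{2}}\sum_{j=0}^{T-1} \lvert X_j\rvert^{2},
\qquad
\hat{\sigma}^{2}(B) \;=\; \frac{1}{T^{2}}\sum_{j\in B} \lvert X_j\rvert^{2}.
```

Here B is any set of frequency indices that contains T - j whenever it contains j, so that the band-limited variance is real and each cycle is counted with its mirror image. The identity holds exactly for the raw periodogram; a smoothed spectral estimate would reproduce the time domain variance only approximately.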

Finally, for business cycle analysis, economists usually pay attention only to economic
fluctuations with cycle lengths less than 8 years. From Tables 4.3.1 and 4.3.2, we see that the
differences between the filtered standard deviations and their population counterparts become
markedly smaller as we switch our focus to cycles shorter than 8 years. Indeed, the standard deviations
of the series filtered with λ = 10,000 are very close to their population counterparts. This suggests that
the HP filter with λ = 10,000 or larger may be able to capture the subset of time series variation that
most economists associate with cyclical fluctuations.

These results give us an answer to the first question raised in the previous section. It
seems clear that the spectra of the nonstationary data are different than expected because the
estimated spectra are distorted by the nonstationarity. In particular, the power at zero
frequency spills over to other frequencies. The HP filter with λ = 1600 does reduce the power of
the low frequency components, but based on Table 4.3.2 it seems clear that the HP filter with λ
equal to 1600 might not be the best choice for the purpose of approximating the high frequency
part of the spectra. It is worth noting that for the frequencies from 0.0625π to 0.2π, a better




fit is achieved by imposing a larger λ.22

However, one question remains: How does the choice of λ affect the fit of calibrated RBC
models? We turn to this in the next section.

4.4  Fit  of  an  RBC  Model 


We know that the larger the value of λ, the more the HP filter can preserve the power of
high frequency components when it is used to detrend time series. Does the choice of λ affect,
in a meaningful way, the calculated second moments from U.S. data and the model simulated data?
That is, how robust are those second moments to the parameter used in the HP filter? To answer
this question, Table 4 lists standard deviations and correlations of variables with the output series
from filtered U.S. data and model simulated data with different λ's. This table shows us that the
standard deviations and the correlations calculated by the traditional method are not particularly
robust to the value of the parameter used in the HP filter. For the actual U.S. data, the
standard deviations from the time series filtered with λ = 10,000 and those from the series
filtered with λ = 400 differ by about 50%. The correlations, though, are relatively stable.
Similar results are observed with the simulated data.

These  results  raise  a couple  of  questions:  (1)  Is  the  measurement  of  business  cycle 
fluctuations  of  the  U.S.  economy  obtained  by  HP  filtering  dependable,  since  the  second  moments 
of  U.S.  data  are  variable  when  different  values  of  the  smoothing  parameter  are  used?  (2)  Is  the 

22 Indeed, to deal with the problem of excess power from the undesirable low frequencies left over
by using the same filter with different time series, it is perhaps better to calculate the statistics in the frequency
domain where we can include only the desired frequency variations.




matching of the levels of the relevant standard deviations from the simulated data and actual U.S.
data sensitive to the value of λ? Combining these with the fact that the value of λ is picked
arbitrarily to some degree, the legitimacy of judging the model's fit based on (only) λ = 1600 is
questionable.

As both a theoretical proposition and, as Table 4 demonstrates, a practical fact, the larger
the value of λ, the better the HP filter preserves the high frequency components. But the larger
the value of λ, the more low frequency components will be included in the standard deviations
and correlations if one calculates those statistics in the time domain. This presents an obvious
dilemma because when studying business cycles, we do not want low frequency components. The basic
problem arises because we are not able to identify the components of different frequencies in
the time domain. In the frequency domain, however, we can easily identify the components of
different frequencies, and calculate the standard deviations and correlations including only the
power of the frequencies we want to investigate. Hence, we propose the following procedure to
calculate the standard deviations and correlations in an RBC framework.

First, use the HP filter with λ = 10,000 to filter the time series. The major purpose of this
filtering is to detrend the time series before we calculate their statistics in order to limit the severe
"leakage" problem with its attendant distortion of the spectra that arises when the data are
non-stationary.23 Choosing λ = 10,000 means that over 90% of the desirable power from high
frequency components is retained. Of course, by using 10,000, quite a bit of undesirable low
frequency power is also included, but this will be excluded when we calculate the second

23 Since we filter the data to be rid of the leakage problem, filtering stationary time series is not
necessary. However, considering that empirically no one can be sure about the stationarity of the data, and
keeping in mind the fact that the HP filter does not eliminate high frequency components, filtering all the
data, both non-stationary and those that may be stationary, before moving to the next step does no real harm.




moments. 

Second, transform the filtered series to the frequency domain. The resulting sample spectra
should be very close to the true ones, especially in the high frequency part. Therefore comparing
the high frequency components of actual U.S. data to those from the simulated data will provide
us with information about the match of the model in the frequency domain.

Third,  calculate  the  standard  deviations  and  correlations  from  the  spectra  including  only 
the  high  frequency  components.  By  doing  this  in  the  frequency  domain,  theoretically  we  can  get 
the  standard  deviation  and  correlation  contributed  by  any  suggested  frequency  band  width; 
empirically,  with  an  increasing  sample  size,  the  degrees  of  freedom  for  choosing  the  band  width 
will  increase. 
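
The three steps above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the code behind the results reported here: the function names `hp_cycle` and `band_moments`, the dense-matrix solution of the HP filter, the raw (unsmoothed) periodogram, and the two simulated series are all hypothetical choices made for the example.

```python
import numpy as np

def hp_cycle(y, lam):
    """Step 1: HP cyclical component y - trend, where the trend solves
    (I + lam * D'D) trend = y and D is the (T-2) x T second-difference matrix."""
    y = np.asarray(y, dtype=float)
    T = y.size
    D = np.zeros((T - 2, T))
    for t in range(T - 2):
        D[t, t:t + 3] = (1.0, -2.0, 1.0)
    trend = np.linalg.solve(np.eye(T) + lam * (D.T @ D), y)
    return y - trend

def band_moments(x, y, omega_low):
    """Steps 2 and 3: std of x, std of y and their correlation, using only
    Fourier frequencies omega_j = 2*pi*j/T with omega_low <= omega_j <= pi."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    y = np.asarray(y, dtype=float) - np.mean(y)
    T = x.size
    fx, fy = np.fft.rfft(x), np.fft.rfft(y)
    omega = 2.0 * np.pi * np.arange(fx.size) / T
    # Each positive frequency also stands for its negative mirror image,
    # except zero and (for even T) pi, which occur only once.
    weight = np.full(fx.size, 2.0)
    weight[0] = 1.0
    if T % 2 == 0:
        weight[-1] = 1.0
    w = weight * (omega >= omega_low - 1e-12)
    var_x = np.sum(w * np.abs(fx) ** 2) / T**2
    var_y = np.sum(w * np.abs(fy) ** 2) / T**2
    cov_xy = np.sum(w * np.real(fx * np.conj(fy))) / T**2
    return np.sqrt(var_x), np.sqrt(var_y), cov_xy / np.sqrt(var_x * var_y)

# Hypothetical data: two series sharing a stochastic trend (128 quarters).
rng = np.random.default_rng(1)
T = 128
common = np.cumsum(rng.standard_normal(T))
a = common + rng.standard_normal(T)
b = 0.5 * common + rng.standard_normal(T)

ca, cb = hp_cycle(a, 10_000), hp_cycle(b, 10_000)     # detrend with lambda = 10,000
sd_a, sd_b, corr = band_moments(ca, cb, np.pi / 16)   # cycles of 32 quarters or less
print(f"band std a = {sd_a:.3f}, band std b = {sd_b:.3f}, band corr = {corr:.3f}")
```

Setting `omega_low` to zero reproduces the ordinary time domain standard deviations and correlation of the filtered series, so the band-limited statistics can be read as the part of those moments contributed by the chosen frequencies.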

The final step after this procedure is conventional: compare the estimated standard
deviations and correlations from the actual and model generated data to see whether they are
"close".

We can use Hansen's (1985) model to explore the results from this suggested procedure.
First, Figure 4.5 shows the spectra from actual U.S. data and the model generated data after being
filtered with λ = 10,000. We see in Figure 4.5.1(a) and (b) that the model simulated output series
has higher power at almost every frequency than the corresponding U.S. series.
Hence the model simulated output series fluctuates more than the actual U.S. output series. Figures
4.5.3 and 4.5.4 are similar. For both the investment and employment series, the spectra of the
simulated data are generally above those of actual data, so the standard deviations of investment
and employment from simulated data will be larger than their counterparts from U.S. data. Figure
4.5.2(a) tells us the opposite story is true for consumption. Here the spectrum of the simulated


Figure 4.5.2(a)  Spectra of Consumption Series, All Frequencies

Figure 4.5.3(b)  Spectra of Investment Series, High Frequencies



consumption series is always under that of actual U.S. consumption, especially in the lower
frequencies. This indicates that the standard deviation of the U.S. consumption series is larger than that
of the simulated series and that the biggest differences occur for frequencies between π/16 and π/8.
All in all, it seems that the match between the power spectra of U.S. data and simulated data is
not bad, except for the consumption series. Next we need to check the fit quantitatively.

Table 4.5.1 lists the standard deviations and correlations from the U.S. data and the
simulated data calculated using the procedure suggested above. The first group in the table
includes all fluctuations with cycle length shorter than 8 years (frequencies higher than π/16). This
is the business cycle part of economic fluctuations that real business cycle models should be trying to
measure and explain. We see that the standard deviations of the output series are very close. The
standard deviations of both investment and employment from the simulated data are larger than
those from the U.S. data. The worst match is for the standard deviations of the consumption
series: the fluctuation of the simulated data is only about half that of the U.S. data as measured by the
standard deviation. The table also shows that the correlations between output and the other
variables are generally a little higher with the simulated data.
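
The correspondence between frequency bands and cycle lengths used in the tables follows from the relation between angular frequency and period. As a short worked check, assuming quarterly observations (as the 57.1 to 88.4 sample suggests) and frequencies measured in radians per quarter:

```latex
p = \frac{2\pi}{\omega}\ \text{quarters}: \qquad
\omega = \tfrac{\pi}{16} \Rightarrow p = 32\ \text{quarters} = 8\ \text{years}, \quad
\omega = \tfrac{\pi}{8} \Rightarrow p = 16\ \text{quarters} = 4\ \text{years}, \quad
\omega = \tfrac{\pi}{4} \Rightarrow p = 8\ \text{quarters} = 2\ \text{years}.
```

Frequencies above each boundary therefore correspond to cycles shorter than the stated length.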

The second and third parts of Table 4.5.1 present the standard deviations and correlations
that are contributed by the frequency components with cycle lengths less than 4 years and 2 years.
As the length of the cycle being considered decreases, the match of the standard
deviations and correlations generally deteriorates. Only for consumption is the difference between the
standard deviation from the actual U.S. data and that of the simulated data reduced when moving
to cycles with a shorter length. Additionally, the table shows that the correlations between actual
output and the other variables decrease as shorter cycles are considered. This suggests that for




these very short cycle fluctuations, changes in consumption, investment and employment are not
as closely related to changes in output as they are for the long cycle fluctuations. However, the
correlations from the simulated data tell the opposite story. Here the correlations remain constant
or increase as the cycle length is reduced. Clearly this indicates that the simulated data are missing
some aspects of the real world data.

Tables 4.5.2 and 4.5.3 present similar standard deviations and correlations calculated using
values for λ of 1600 and 400. While the quantitative standard deviations and correlations differ,
it is clear that the qualitative results in these tables are similar to those in Table 4.5.1. In
particular, the standard deviations of the actual output and simulated output are quite close, though
they become less so when shorter cycles are considered. The standard deviation of the simulated
consumption series remains well below that of the actual series, while the standard deviations of
the actual investment and employment data are somewhat less than those of the simulated data.
And, disturbingly, the simulated correlations between output and the other series diverge quite
strongly from the actual correlations between output and the other data as shorter cycles are
considered.

Even though the results of Tables 4.5.2 and 4.5.3 are similar to those in Table 4.5.1, we
see that there is, however, an important quantitative difference. In particular, when we filtered the
data using values of λ less than 10,000, the standard deviations of the variables are smaller than
when we used λ equal to 10,000. This result is what we expected, since we know from Figures 4.1
and 4.4 that as λ becomes smaller, there is a greater reduction in the power of the (desirable)
business cycle higher frequencies.

In  order  to  exhibit  the  dynamics  of  the  filtered  time  series,  we  draw  the  original  U.S.  GNP 




time series and the filtered series (GNP, consumption, investment and employment) in Figure 4.6.1
and Figures 4.6.2 ~ 4.6.5 respectively. In Figures 4.6.2 ~ 4.6.5, the filtered time series 1
includes all the components whose cycles are under 8 years; the filtered series 2 includes all the
components whose cycles are under 4 years; the filtered series 3 only includes the components
whose cycles are shorter than 2 years (some economists refer to those components as noise).

From the comparison of Figures 4.6.1 and 4.6.2, we can see that the filtered GNP series
capture the major characteristics of business cycle fluctuations of the U.S. economy and that the
filtered series separate the fluctuations by cycle length. From 1957 (the beginning of our
sample) to 1970, almost none of the fluctuations have cycles longer than 4 years, since the
curves of filtered series 1 and filtered series 2 are very close to each other. After that (from 1970
to 1988), the fluctuations are more persistent. In general, the long cycle components in the
U.S. economy are more powerful than the short cycle components, indicating that long
cycle fluctuations usually play a major role in the deviation of the economy from its growth track.
Examining Figures 4.6.2 ~ 4.6.5, it is clear that fluctuations in consumption, investment and
employment share the same pattern as the GNP series. Taking Figures 4.6.2 to 4.6.4 together, it is clear
that the long cycle components in the four time series are more correlated than the short cycle
components. This visual information matches the numerical correlations in Table 5.

In summary, the suggested procedure differs from the conventional method in the following
respects: (1) Start with λ = 10,000 as the smoothing parameter for the HP filter rather
than λ = 1600. This change retains more of the high frequency components after filtering. Our
purpose in filtering is not to try to reduce all the undesired low frequency components (as in the
traditional method) but just to detrend the series, so that the spectrum no longer has a severe


Figure 4.6.2  Detrended and Filtered U.S. GNP Series

[Panel labels: logged employment, logged investment, logged consumption. Period: from 57.1 to 88.4]

Figure 4.6.5  U.S. Employment Series, Detrended and Filtered



leakage  problem.  (2)  Move  the  work  of  calculating  the  standard  deviations  and  correlations 
from  the  time  domain  to  the  frequency  domain.  By  so  doing,  we  can  easily  identify  the  business 
cycle  components  and  get  statistics  which  are  determined  by  them.  Thus,  the  statistics  from  our 
procedure  can  measure  business  cycle  fluctuations  more  precisely,  and  enable  us  to  better  check 
the  fit  of  the  model  in  this,  the  relevant,  dimension. 

4.5  Summary and Conclusions

Following  King  and  Rebelo  (1993)  and  Watson  (1993),  we  study  the  filtering 
methodology  and  business  cycles  in  the  frequency  domain.  For  models  designed  to  explain  the 
business  cycle,  isolation  of  the  business  cycle  part  of  fluctuations  from  the  trend  part  is  crucial. 
In  pursuing  this  goal,  researchers  need  to  pick  fluctuations  within  a certain  length.  Kydland  and 
Prescott  (1982)  suggest  8 years  as  an  upper  bound  for  business  cycle  lengths  and  hence  propose 
to  use  the  HP  filter  with  a smoothing  parameter  of  1600  to  select  these  frequencies.  While  this 
filter  has  been  widely  applied  for  more  than  ten  years,  until  King  and  Rebelo  (1993)  very  little 
research  focused  on  this  issue.  Theoretically,  King  and  Rebelo  found  that  the  HP  filter  ought  to 
preserve  much  of  the  high  frequency,  business  cycle  components  of  the  data  when  used  to  detrend 
data.  Empirically,  however,  they  found  that  HP  filtering  alters  measurements  of  persistence, 
variability  and  co-movement. 

King  and  Rebelo’s  empirical  work  looked  at  only  the  time  domain  representation  of  the 
data.  We  focus  almost  exclusively  on  the  frequency  domain  representation.  The  frequency  domain 
has  a major  advantage  because  it  enables  us  to  calculate  a precise  measure  of  the  variance  and 




correlations that result from business cycle fluctuations. Calculating these statistics in the time domain
forces us to use not only fluctuations with business cycle lengths but also fluctuations that are
better described as trend or growth components of the data.

We find that by properly adjusting the smoothing parameter, the HP filter can be set to
preserve a selected proportion of frequency components higher than a given level when it is used
to detrend different time series. But, even though the HP filter can be very good at saving high
frequency components of the data, the way it is employed in the real business cycle literature is
questionable. For business cycle research, we want to retain frequency components that have
cycles shorter than 8 years. With this as our goal, the conventional use of λ = 1600 is not a sound
choice for the smoothing parameter because it cuts too much of the business cycle components
from both the U.S. data and the simulated data.

To avoid these problems we suggest a new procedure to measure how well real business
cycle models match actual data: 1) Use the HP filter to filter the time series with
λ = 10,000; 2) Calculate the standard deviation and correlation in the frequency domain and include
only the components which are thought to be the cyclical part of the fluctuations. In this method, the
HP filter is not used to try to remove all the undesired low frequency components but only to
detrend the data. Our second step enables a more thorough removal of undesired, low frequency, "trend"
components of the data. And, from the point of view of determining the match between the
simulated data and actual data, our proposed method enables us to emphasize whatever frequencies
the investigator wishes to stress.

Taking Hansen's (1985) model as a benchmark, our results suggest that in the frequency
band of π/16 ~ π (which implicitly defines the business cycle as fluctuations with length under




8 years), the standard deviation match of the output and employment series from U.S. data and the
model is quite good, while the match for consumption and investment is not so good. The
correlation matches between output and the other data series are reasonably good. When we
narrow the width of the frequency band to π/8 ~ π and further to π/4 ~ π (with corresponding cycle
lengths under 4 years and under 2 years respectively), the model's match substantially declines.
This is particularly true for the correlations between output and the other data series. In the actual
data, these correlations fall in magnitude as shorter cycles are considered. In the simulated data,
however, the correlations increase in size so that when very short cycles are examined the
correlations generated by the simulated data are quite a bit different from those generated by the
actual data.

Finally, we think the filtering method and spectral theory have great potential in
macroeconomics and time series research. While most studies of macroeconomic time series rely on
assumptions which are difficult to verify, the filtering method's advantage of requiring no prior
beliefs appears to be very attractive. Also, the immediate interpretation of the power-frequency
distribution suggests that filtering methodology will be one of the major tools in the study of
economic time series.




Table 4.1  Critical Values of HP Transfer Function

Parameter     π/16                       π/12                       π/8
Values        (cycle length: 8 years)    (cycle length: 6 years)    (cycle length: 4 years)

λ = 400       0.1                        0.4                        0.8
λ = 1600      0.5                        0.8                        0.95
λ = 10000     0.9                        0.95                       0.99
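
The entries of Table 4.1 can be approximated from the well-known frequency response of the HP cyclical filter, 4λ(1 - cos ω)² / (1 + 4λ(1 - cos ω)²). The sketch below assumes that the table reports the squared gain (the power transfer function) with ω in radians per quarter; that reading, like the function name, is an assumption made here for illustration.

```python
import math

def hp_cycle_power_transfer(omega, lam):
    """Squared gain of the HP cyclical filter at frequency omega."""
    g = 4.0 * lam * (1.0 - math.cos(omega)) ** 2
    return (g / (1.0 + g)) ** 2

# Frequencies pi/16, pi/12 and pi/8, the three columns of Table 4.1.
divisors = (16, 12, 8)
for lam in (400, 1600, 10000):
    row = [hp_cycle_power_transfer(math.pi / d, lam) for d in divisors]
    print(f"lambda = {lam:5d}: " + "  ".join(f"{v:.2f}" for v in row))
```

Under this reading the computed values agree with the rounded table entries to within a few hundredths, and they show directly how a larger λ retains more power near the 8 year boundary.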

Table 4.2  The HP Filtering Effect (λ = 1600)

                         U.S. Data†                           Simulated Data*
                 Unfiltered        Filtered           Unfiltered          Filtered
Variables        Std      Corr     Std     Corr       Std      Corr       Std      Corr

output           15.08    1.00     2.18    1.00       5.04     1.00       2.18     1.00
                                                      (1.28)   (0.00)     (0.24)   (0.00)

consumption      15.50    0.99     1.26    0.77       3.12     0.84       0.64     0.86
                                                      (1.14)   (0.05)     (0.10)   (0.04)

investment       16.90    0.96     5.53    0.92       13.06    0.93       7.10     0.98
                                                      (2.60)   (0.03)     (0.78)   (0.01)

employment       3.36     -0.09    1.55    0.85       2.91     0.83       1.66     0.98
                                                      (0.52)   (0.07)     (0.18)

† The data are provided by Mark W. Watson, and cover the period 57.1 - 88.4.

* Within the parentheses are standard deviations of second moments from simulated data.


Table 4.3.1  Standard Deviations in Percentage

[Table body illegible in the source scan.]

* The population standard deviations are computed from the theoretical models in the frequency domain skipping zero frequency.


Table 4.3.2  Standard Deviations in Percentage of High Frequency Part

[Table body illegible in the source scan.]

* Std 2 signifies the standard deviations contributed by the cycles whose lengths are less than 8 years.


Table 4.4  Second Moments of Filtered Series

[Table body illegible in the source scan.]




Table 4.5.1  (λ = 10000)

                     U.S. data           Simulated data
Variables            Std      Corr       Std      Corr

Group 1 (Cycle length less than 8 years)
output               2.29     1.00       2.30     1.00
consumption          1.29     0.75       0.68     0.86
investment           6.00     0.93       7.48     0.99
employment           1.62     0.85       1.75     0.99

Group 2 (Cycle length less than 4 years)
output               1.59     1.00       1.65     1.00
consumption          0.75     0.71       0.42     0.92
investment           3.38     0.90       5.42     0.99
employment           1.00     0.80       1.26     0.99

Group 3 (Cycle length less than 2 years)
output               0.76     1.00       1.13     1.00
consumption          0.39     0.48       0.28     0.94
investment           1.65     0.71       3.74     0.99
employment           0.58     0.66       0.87     0.99



Table 4.5.2  (λ = 1600)

                     U.S. data           Simulated data
Variables            Std      Corr       Std      Corr

Group 1 (Cycle length less than 8 years)
output               2.12     1.00       2.09     1.00
consumption          1.18     0.77       0.59     0.87
investment           5.45     0.93       6.84     0.99
employment           1.50     0.85       1.60     0.99

Group 2 (Cycle length less than 4 years)
output               1.56     1.00       1.60     1.00
consumption          0.74     0.71       0.40     0.93
investment           3.32     0.89       5.31     0.99
employment           0.98     0.79       1.24     0.99

Group 3 (Cycle length less than 2 years)
output               0.76     1.00       1.12     1.00
consumption          0.39     0.47       0.26     0.96
investment           1.65     0.70       3.70     0.99
employment           0.58     0.67       0.87     1.00



Table 4.5.3  (λ = 400)

                     U.S. data           Simulated data
Variables            Std      Corr       Std      Corr

Group 1 (Cycle length less than 8 years)
output               1.82     1.00       1.84     1.00
consumption          0.95     0.77       0.48     0.90
investment           4.44     0.91       6.04     0.99
employment           1.26     0.83       1.42     0.99

Group 2 (Cycle length less than 4 years)
output               1.50     1.00       1.56     1.00
consumption          0.71     0.71       0.38     0.93
investment           3.19     0.89       5.15     0.99
employment           0.94     0.78       1.21     0.99

Group 3 (Cycle length less than 2 years)
output               0.75     1.00       1.11     1.00
consumption          0.38     0.44       0.26     0.96
investment           1.61     0.69       3.68     0.99
employment           0.57     0.65       0.86     0.99

CHAPTER  5 
CONCLUSIONS 

In  a strictly  exogenous  cointegration  model  expressed  in  equations  (2.1)  ~ (2.3),  the 
student-t  distribution  provides  poor  coverage  of  the  confidence  interval  for  the  OLS  estimator.  The 
proposed  bootstrap  percentile-t  method  does  provide  reliable  confidence  intervals. 

In  more  general  cointegration  models  with  endogenous  regressors,  a low-pass  filter 
method  for  estimating  cointegrating  vectors  has  been  suggested  in  Chapter  3.  It  has  been  proved 
that  the  proposed  Filtered  Least  Squares  estimator  is  asymptotically  more  efficient  than  the  OLS 
estimator,  and  the  Filtered  Fully  Modified  Least  Squares  estimator  shares  the  asymptotic  efficiency 
with the Fully Modified Least Squares estimator. The Monte Carlo study suggests that both filtered
estimators are significantly more efficient than the OLS and FMLS estimators in finite samples.
Since the Filtered Least Squares estimator is more efficient than the OLS estimator and, in practice,
OLS is widely used as a first stage estimator for some efficient estimators, some improvement
can be expected from using the Filtered Least Squares estimator as a more efficient first stage
estimator.

By  examining  the  difference  between  using  a difference  filter  and  the  HP  filter,  we  find 
that  (1)  The  high-pass  HP  filter  has  the  nice  property  of  recovering  the  high  frequency  component 
for  not  only  a stationary  series  but  also  for  a nonstationary  series.  (2)  A frequency  domain  analysis 
shows  that  the  first  difference  filter  is  not  very  good  at  revealing  the  short  run  dynamics  of  a 
nonstationary  series.  (3)  The  second  moments  (standard  deviations  and  correlations)  calculated 
from the high-pass HP filtered series (λ = 1600) are not precise enough to capture the features of




the  high  frequency,  business  cycle  component  of  the  time  series.  This  last  result  occurs  because 
these  statistics  contain  some  undesirable  low  frequency  components,  whose  power  may  be  very 
strong  for  a nonstationary  series.  Based  on  the  analysis,  we  propose  a two  step  band-pass  filtering 
procedure  to  measure  the  fit  of  real  business  cycle  models.  By  conducting  this  procedure,  we  can 
obtain  the  standard  deviations  and  the  correlations  contributed  by  any  suggested  frequency  band 
width.  Finally,  I point  out  that  the  procedure  suggested  here  is  applicable  not  only  to  measuring 
the  fit  of  calibrated  models,  but  also  to  cases  that  require  the  detrending  or  the  isolation  of  a 
certain  range  of  frequencies  without  imposing  any  prior  specifications. 
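
The contrast between the first difference filter and the high-pass HP filter can be illustrated
with their frequency responses. The sketch below is not taken from the dissertation. It assumes the
standard amplitude gain 2|sin(w/2)| of the first difference filter and the commonly cited gain
4*lam*(1 - cos w)^2 / [1 + 4*lam*(1 - cos w)^2] of the cyclical HP filter, with quarterly data and
lam = 1600. The function names are illustrative only.

```python
import numpy as np


def first_difference_gain(omega):
    """Amplitude gain |1 - exp(-i*omega)| = 2*|sin(omega/2)| of the filter (1 - L)."""
    return 2.0 * np.abs(np.sin(np.asarray(omega, dtype=float) / 2.0))


def hp_highpass_gain(omega, lam=1600.0):
    """Gain of the high-pass (cyclical) HP filter; lam is the smoothing parameter."""
    x = 4.0 * lam * (1.0 - np.cos(np.asarray(omega, dtype=float))) ** 2
    return x / (1.0 + x)


# Periods in quarters: a very long swing, an 8-year cycle, a 2-year cycle, the shortest cycle.
periods = np.array([200.0, 32.0, 8.0, 2.0])
omegas = 2.0 * np.pi / periods

for period, omega in zip(periods, omegas):
    print(f"period {period:6.1f}  first difference {float(first_difference_gain(omega)):.3f}"
          f"  HP high-pass {float(hp_highpass_gain(omega)):.3f}")
```

Under these assumptions the first difference filter passes little of a 32-quarter cycle and
amplifies the shortest cycles, while the HP gain rises toward one over the same range.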


APPENDIX 

PROOFS  OF  LOW-PASS  FILTERED  LEAST  SQUARES  ESTIMATORS  OF 
COINTEGRATING  VECTORS 

Proof of Proposition 3.2.2: We first show that $y_{1t}$ is I(1).

We construct a high-pass filter that is orthogonal to $h(L)$, namely $q(L) = 1 - h(L)$. It is easy
to see that the new filter is also symmetric. With the symmetry property and (b) of Assumption
3.2.1, $h(1) = 1$, we are ready to decompose the high-pass filter. Our decomposition approach
follows Baxter and King (1994) in general, with the main differences arising from the issue of an
infinite moving average. From the assumption $h(1) = 1$, we can deduce $q(1) = 0$. Thus we can write

\[
q_\infty(L) \equiv \lim_{K\to\infty} \sum_{k=-K}^{K} q_k L^k
  = \lim_{K\to\infty} \sum_{k=-K}^{K} (q_k L^k - q_k)
  = \lim_{K\to\infty} \sum_{k=1}^{K} q_k (L^k + L^{-k} - 2).
\]

The first equality follows from the property $\sum_k q_k = 0$. The second follows from the
symmetry property, $q_k = q_{-k}$. It is easy to see that

\[
(L^k + L^{-k} - 2) = -(1 - L^k)(1 - L^{-k}).
\]

We observe that $(1 - L^k) = (1 - L)[1 + L + L^2 + \dots + L^{k-1}]$ and

\[
[1 + L + L^2 + \dots + L^{k-1}]\,[1 + L^{-1} + L^{-2} + \dots + L^{-(k-1)}]
  = \sum_{h=-(k-1)}^{k-1} (k - |h|) L^h.
\]


Hence, we can rewrite $q_\infty(L)$ as

\[
q_\infty(L) = \lim_{K\to\infty} \sum_{k=-K}^{K} q_k L^k
  = -\lim_{K\to\infty} \sum_{k=1}^{K} q_k [(1 - L^k)(1 - L^{-k})]
  = -(1 - L)(1 - L^{-1})\,\psi_\infty(L),
\]

where

\[
\psi_\infty(L) = \sum_{k=1}^{\infty} q_k \sum_{h=-(k-1)}^{k-1} (k - |h|) L^h
  = \sum_{k=1}^{\infty} q_k \Big[ k + \sum_{h=1}^{k-1} (k - h)(L^h + L^{-h}) \Big]
\]

is a symmetric infinite moving average. Further we define the $r_k$'s as the coefficients of the
polynomial $\psi_\infty(L)$, i.e. $\psi_\infty(L) = \sum_k r_k (L^k + L^{-k})$.

With a little bit of algebra, we can show

\[
r_k = q_k \sum_{h=1}^{k-1} (k - h) = q_k (1 + 2 + \dots + (k-1)) = q_k \big[ \tfrac{1}{2} k(k-1) \big].
\]

Assumption 3.2.1(c) implies that $q_k = c^k$ ($c < 1$). Thus, it is straightforward that
$q_k [\tfrac{1}{2} k(k-1)] \to 0$ as $k \to \infty$. Note that $\{r_k\}$ can be shown to be
absolutely summable. Therefore, $\psi_\infty(L) = \sum_{k=1}^{\infty} r_k (L^k + L^{-k})$ converges
to a well defined autocovariance-generating function.


To this end, we can see that the high-pass filter $q_\infty(L)$ contains (at least) two differences,
i.e., it can render stationary a nonstationary process of order up to I(2). In other words,
$q_\infty(L)$ extracts only a stationary component from $\{y_{1t}\}$, an I(1) process. In terms of
the Beveridge-Nelson decomposition theory, a nonstationary component of the I(1) process is left
behind by applying the high-pass filter to $\{y_{1t}\}$, or alternatively, we can say that the
low-pass filter $h_\infty(L)$ extracts an I(1) process from $\{y_{1t}\}$. Similarly, we can show
that $y_{2t}$ is an I(1) process.

Since $u_t$ is an I(0) process, applying the low-pass filter will not alter the stationarity of $u_t$.

Q.E.D. 
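
The two facts used in the proof above can be checked numerically: the triangular-weight identity,
and the claim that a symmetric filter whose weights sum to zero contains two differences. The
sketch below is only an illustration under stated assumptions. The geometrically decaying weights
0.5^|k|, the truncation point K = 12, and all variable names are hypothetical choices and do not
come from the dissertation.

```python
import numpy as np

# Hypothetical symmetric low-pass weights, truncated at K and normalised so that h(1) = 1.
K = 12
lags = np.arange(-K, K + 1)
h = 0.5 ** np.abs(lags)
h = h / h.sum()

# High-pass filter q(L) = 1 - h(L); its weights sum to zero, i.e. q(1) = 0.
q = -h.copy()
q[K] += 1.0

# Identity: [1 + L + ... + L^(k-1)] [1 + L^-1 + ... + L^-(k-1)] = sum_h (k - |h|) L^h.
k = 5
product = np.convolve(np.ones(k), np.ones(k))
triangle = k - np.abs(np.arange(-(k - 1), k))
identity_holds = np.allclose(product, triangle)

# Two differences: q(L) annihilates a linear trend and maps a quadratic trend to a constant.
t = np.arange(200, dtype=float)
filtered_trend = np.convolve(3.0 + 0.7 * t, q, mode="valid")
filtered_quadratic = np.convolve(t ** 2, q, mode="valid")

print("triangular identity holds:", identity_holds)
print("largest |q(L) applied to linear trend|:", np.max(np.abs(filtered_trend)))
print("range of q(L) applied to quadratic trend:", np.ptp(filtered_quadratic))
```

The end points are dropped with mode="valid" because a two-sided filter is not defined there.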


Proof of Corollary 3.2.3: It suffices to show that these filters satisfy Assumption 3.2.1.

(a) King and Rebelo (1993) pointed out that $\theta$ is real and less than one for any
$\lambda > 0$. For completeness, we show the proof. Solving the limiting version of the first order
conditions for (10), we get a polynomial in $L$, $F^{ES}(L)$,

\[
F^{ES}(L) = -\lambda L^{-1} + (1 + 2\lambda) - \lambda L
          = [\lambda (1 - L)(1 - L^{-1}) + 1].  \tag{A.3.1}
\]

We define $\theta$ to be the smaller root of $F^{ES}(L)$. By the symmetry property, we have
$F^{ES}(\theta) = F^{ES}(\theta^{-1}) = 0$. Thus, from (A.3.1),

\[
F^{ES}(L) = \frac{1}{L} [-\lambda L^2 + (2\lambda + 1) L - \lambda] = 0, \quad L \neq 0.  \tag{A.3.2}
\]

Solving (A.3.2), we have

\[
\theta = \{ (2\lambda + 1) - [(2\lambda + 1)^2 - 4\lambda^2]^{1/2} \} / (2\lambda).
\]

It is rather straightforward to show $0 < \theta < 1$. Hence, this is sufficient for all conditions
in Assumption 3.2.1.
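
As a quick numerical check of part (a), the sketch below evaluates the closed-form root for a few
values of the smoothing parameter and confirms that it lies strictly between zero and one and that
both it and its reciprocal solve (A.3.1). The function names and the particular values of lam are
illustrative assumptions.

```python
import math


def es_root(lam):
    """Smaller root of lam*L**2 - (2*lam + 1)*L + lam = 0, as in the closed form above."""
    disc = (2.0 * lam + 1.0) ** 2 - 4.0 * lam ** 2  # equals 4*lam + 1, positive for lam > 0
    return ((2.0 * lam + 1.0) - math.sqrt(disc)) / (2.0 * lam)


def f_es(L, lam):
    """The polynomial F_ES(L) = -lam/L + (1 + 2*lam) - lam*L from (A.3.1)."""
    return -lam / L + (1.0 + 2.0 * lam) - lam * L


for lam in (0.1, 1.0, 100.0, 1600.0):
    theta = es_root(lam)
    print(f"lam = {lam:7.1f}  theta = {theta:.6f}  "
          f"F(theta) = {f_es(theta, lam):.2e}  F(1/theta) = {f_es(1.0 / theta, lam):.2e}")
```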

(b) It can be shown (see Appendix A in King and Rebelo (1989)) that the coefficients in
the HP low-pass filter can be expressed as

\[
h_j = 2 R r^{j} \cos(M + jm),
\]

where $r$, $m$, $R$, and $M$ are related to $\theta_1$, $\theta_2$, $A_1$ and $A_2$ by the polar
form representation

\[
\theta_1 = r \exp(im), \quad \theta_2 = r \exp(-im), \quad
A_1 = R \exp(iM), \quad A_2 = R \exp(-iM).
\]

It is obvious that there are an infinite number of $h_j$'s which are nonzero as $j$ approaches
infinity. Thus, condition (a) is satisfied. Because $|\theta_1| < 1$, $r < 1$, and
$|\cos(M + jm)| \le 1$, condition (c) is satisfied. Condition (b) follows immediately. This
completes the proof.
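
The polar form in part (b) can be illustrated numerically. The sketch below assumes, as an
interpretation not spelled out in the text, that each weight is the sum of two complex conjugate
terms, A1*theta1**j + A2*theta2**j, and checks that this sum equals 2*R*r**j*cos(M + j*m) and is
bounded by 2*R*r**j. The numerical values of r, m, R and M are arbitrary placeholders, not the
values implied by any particular smoothing parameter.

```python
import cmath
import math

# Arbitrary illustrative values (hypothetical); only r < 1 matters for the geometric bound.
r, m = 0.9, 0.3
R, M = 0.5, -0.4

theta1 = r * cmath.exp(1j * m)
theta2 = theta1.conjugate()
A1 = R * cmath.exp(1j * M)
A2 = A1.conjugate()


def weight_complex(j):
    """Assumed form of the j-th weight as a sum of two conjugate terms."""
    return A1 * theta1 ** j + A2 * theta2 ** j


def weight_polar(j):
    """The same weight written in polar form."""
    return 2.0 * R * r ** j * math.cos(M + j * m)


for j in (0, 1, 5, 20):
    print(j, weight_complex(j).real, weight_polar(j), 2.0 * R * r ** j)
```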

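Proof of Proposition 3.3.1. From (3.8) we find that

\[
T(\hat\beta_{OLS} - \beta) = \Big( T^{-2} \sum y_{2t} y_{2t}' \Big)^{-1}
                             \Big( T^{-1} \sum y_{2t} v_{1t} \Big).  \tag{A.3.3}
\]

By substituting $y_{2t} = h_p(L) x_{2t}$ into the first factor on the right-hand side of (A.3.3),
we have

\[
T^{-2} \sum_{t=1}^{T} y_{2t} y_{2t}'
  = T^{-2} \sum_{t=1}^{T} \Big( \sum_{r=-p}^{p} h_r x_{2,t-r} \Big)
                          \Big( \sum_{s=-p}^{p} h_s x_{2,t-s} \Big)'
  = T^{-1} \sum_{r=-p}^{p} \sum_{s=-p}^{p} h_r h_s \, c_{22}(n),
\]

where

\[
c_{22}(n) = T^{-1} \sum_{t} x_{2t} x_{2,t+n}'.
\]

Similarly,

\[
T^{-1} \sum_{t=1}^{T} y_{2t} v_{1t}
  = T^{-1} \sum_{t=1}^{T} \Big( \sum_{r=-p}^{p} h_r x_{2,t-r} \Big)
                          \Big( \sum_{s=-p}^{p} h_s v_{1,t-s} \Big)
  = \sum_{r=-p}^{p} \sum_{s=-p}^{p} h_r h_s \, c_{21}(n),
\]

where

\[
c_{21}(n) = T^{-1} \sum_{t} x_{2t} v_{1,t+n}.
\]

Thus, we have

\[
T(\hat\beta_{OLS} - \beta)
  = \Big( \sum_{r} \sum_{s} h_r h_s \, T^{-1} c_{22}(n) \Big)^{-1}
    \Big( \sum_{r} \sum_{s} h_r h_s \, c_{21}(n) \Big).  \tag{A.3.4}
\]

The next step is to determine the asymptotic behavior of (A.3.4). By Phillips' (1991) Theorem 3.1,

\[
T^{-1} c_{22}(n) \Rightarrow \int_0^1 B_2 B_2', \qquad
c_{21}(n) \Rightarrow \int_0^1 B_2 \, dB_1.
\]

Next, we observe that $p \to \infty$ as $T \to \infty$. By the assumption, we let

\[
\sum_{r=-\infty}^{\infty} h_r = c < \infty.
\]

We get

\[
\sum_{r=-\infty}^{\infty} \sum_{s=-\infty}^{\infty} h_r h_s
  = \Big( \sum_{r=-\infty}^{\infty} h_r \Big) \Big( \sum_{s=-\infty}^{\infty} h_s \Big) = c^2 < \infty.
\]

Similarly we have, for the double sums in both factors of (A.3.4),

\[
\lim_{p\to\infty} \sum_{r=-p}^{p} \sum_{s=-p}^{p} h_r h_s = c^2.
\]

Thus we deduce that

\[
T(\hat\beta_{OLS} - \beta) \Rightarrow
  \Big( c^2 \int_0^1 B_2 B_2' \Big)^{-1} \Big( c^2 \int_0^1 B_2 \, dB_1 \Big).
\]

The proposition follows immediately. Q.E.D.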

Proof  of  Proposition  3.3.2:  This  follows  the  same  lines  as  the  proof  of  the  first  part  of  Theorem 
5.1  of  Phillips  and  Hansen  (1990). 




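Proof of Proposition 3.3.3.

\[
\begin{aligned}
W &= (R\hat\beta_{FMLS} - r)' \big( R \,\mathrm{var}(\hat\beta_{FMLS})\, R' \big)^{-1} (R\hat\beta_{FMLS} - r) \\
  &= (R\hat\beta_{FMLS} - r)' \big( R (\textstyle\sum y_{2t} y_{2t}')^{-1} R' \big)^{-1} (R\hat\beta_{FMLS} - r) \\
  &= \big( R\,T(\hat\beta_{FMLS} - \beta) \big)' \big( R (T^{-2} \textstyle\sum y_{2t} y_{2t}')^{-1} R' \big)^{-1}
     \big( R\,T(\hat\beta_{FMLS} - \beta) \big).
\end{aligned}
\]

By Proposition 3.3.2 and the results from the proof of Proposition 3.3.1, we get

\[
W \Rightarrow \Big[ \Big( \int_0^1 B_2 B_2' \Big)^{-1} \Big( \int_0^1 B_2 \, dB^* \Big) \Big]' R'
  \Big( R \Big( \int_0^1 B_2 B_2' \Big)^{-1} R' \Big)^{-1}
  R \Big[ \Big( \int_0^1 B_2 B_2' \Big)^{-1} \Big( \int_0^1 B_2 \, dB^* \Big) \Big].  \tag{A.3.5}
\]

Observing that conditional on $\mathcal{G}_2 = \sigma(B_2(r),\ 0 \le r \le 1)$, we have

\[
R \Big[ \Big( \int_0^1 B_2 B_2' \Big)^{-1} \Big( \int_0^1 B_2 \, dB^* \Big) \Big] \Big|_{\mathcal{G}_2}
  \equiv N\Big( 0,\ R \Big( \int_0^1 B_2 B_2' \Big)^{-1} R' \Big),
\]

so that conditional on $\mathcal{G}_2$, (A.3.5) is $\chi^2(q)$. Because this distribution is
independent of $\mathcal{G}_2$, the result holds unconditionally. Thus, the proposition is deduced.
Q.E.D.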


An  RBC  Model  with  Indivisible  Labor 


In this part of the appendix we briefly introduce the one-sector stochastic growth model
with indivisible labor of Hansen (1985). The economy is populated by a continuum of identical
infinitely lived households with names on the closed interval [0,1]. The households solve the
following problem:

\[
\max\ E \sum_{t=0}^{\infty} \beta^t u(c_t, \alpha_t), \quad \text{given } k_0,\ A_0,  \tag{A.4.1}
\]

with $u(c_t, \alpha_t) = \log(c_t) + A \alpha_t \log(1 - h_0)$,

subject to

\[
c_t + i_t \le w_t \alpha_t h_0 + r_t k_t \quad \text{and} \quad k_{t+1} = (1 - \delta) k_t + i_t,  \tag{A.4.2}
\]

where $u(c, 1-h)$ is the utility function, $c_t$ denotes consumption, $h_0$ is the indivisible labor
input, $\alpha_t$ is the probability of getting a job in period $t$, $i_t$ is investment, $k_t$ is
capital, $A_0$ is the initial stock of technology, $r_t$ is the capital return, $w_t$ is the wage
rate and $\delta$ is the capital depreciation rate. There is a single firm with access to technology
described by a standard Cobb-Douglas production function of the form:

\[
F(A_t, k_t, h_t) = A_t k_t^{\theta} h_t^{1-\theta},  \tag{A.4.3}
\]

with $h_t = \alpha_t h_0$ and $A_t$ following the law of motion:

\[
\log A_{t+1} = \gamma \log A_t + \varepsilon_t,  \tag{A.4.4}
\]

where the $\varepsilon_t$'s are iid normally distributed with zero mean and variance $\sigma^2$.
Also, for the economy there is a resource constraint:

\[
c_t + i_t \le F(A_t, k_t, h_t).  \tag{A.4.5}
\]


An analytical solution of the above model is extremely difficult to obtain, so the linear-quadratic
approximation method proposed by Kydland and Prescott in their influential paper of 1982 is
employed. The values of the parameters $\theta$, $\delta$, $\beta$, $A$, $\gamma$, $h_0$ and
$\sigma$ are assigned based on empirical evidence and common sense. They are respectively 0.36,
0.025, 0.99, 2, 0.95, 0.53 and 0.00712.
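
The law of motion (A.4.4) with the calibrated values above can be simulated directly. The sketch
below is only an illustration. It assumes that 0.00712 is the standard deviation of the innovation
(the text leaves open whether it is a standard deviation or a variance), and the function name, the
seed and the sample length are hypothetical choices.

```python
import numpy as np

GAMMA = 0.95     # persistence of log technology, from the calibration above
SIGMA = 0.00712  # assumed here to be the standard deviation of the innovation


def simulate_log_technology(n_periods, log_a0=0.0, seed=0):
    """Simulate log A_{t+1} = GAMMA * log A_t + eps_t with eps_t ~ N(0, SIGMA**2)."""
    rng = np.random.default_rng(seed)
    eps = rng.normal(0.0, SIGMA, size=n_periods)
    log_a = np.empty(n_periods + 1)
    log_a[0] = log_a0
    for t in range(n_periods):
        log_a[t + 1] = GAMMA * log_a[t] + eps[t]
    return log_a


# Standard deviation of the stationary distribution of a Gaussian AR(1) process.
stationary_sd = SIGMA / np.sqrt(1.0 - GAMMA ** 2)

path = simulate_log_technology(5000)
print("theoretical stationary s.d.:", stationary_sd)
print("sample s.d. of simulated path:", path[500:].std())
```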


"Spectra"  of  ARIMA(p,d.q)  Models 


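Consider an ARMA(p,q) process,

\[
\theta_p(B) x_t = \phi_q(B) \varepsilon_t,  \tag{A.4.6}
\]

where $\theta_p(B)$ and $\phi_q(B)$ satisfy the stationarity and invertibility conditions
respectively. From (12) and (13) and the fact that the spectrum of the white noise process
$\varepsilon_t$ is $\sigma^2 / 2\pi$, it follows that

\[
\Gamma_x(\omega) = \left| \frac{\phi_q(z)}{\theta_p(z)} \right|^2 \frac{\sigma^2}{2\pi},
\quad z = e^{-i\omega}.  \tag{A.4.7}
\]

It is clear that the ARMA(p,q) process has a finite spectrum. Similarly, an ARIMA(p,d,q)
process can be written as

\[
\theta_p(B) (1 - B)^d y_t = \phi_q(B) \varepsilon_t.  \tag{A.4.8}
\]

In the terminology of filter theory this is merely a filtered version of the ARMA(p,q)
process, and thus from (12), (13), and (A.4.7), it follows that

\[
\Gamma_y(\omega) = |1 - z|^{-2d} \left| \frac{\phi_q(z)}{\theta_p(z)} \right|^2 \frac{\sigma^2}{2\pi},  \tag{A.4.9}
\]

i.e.

\[
\Gamma_y(\omega) = \frac{1}{[2(1 - \cos\omega)]^{d}}
  \left| \frac{\phi_q(z)}{\theta_p(z)} \right|^2 \frac{\sigma^2}{2\pi},
\quad z = e^{-i\omega}.  \tag{A.4.10}
\]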


It is clear here that the ARIMA(p,d,q) process has infinite power at zero frequency and finite
power elsewhere. The theory of spectral analysis does not guarantee that one can get a reliable
Fourier transformation of a nonstationary time series that approaches its frequency counterpart,
the spectrum. Granger (1964) points out that the power spectrum of a deterministic trend is zero
everywhere except for a jump at the origin, and the size of the jump is proportional to the sample
"variance" of the trend. Thus, it is clear that the power spectrum of a stochastic process with a
deterministic trend is identical to that of the same stochastic process without a deterministic
trend except for a jump at the origin. Therefore, the "non-zero-frequency" spectrum of an ARIMA
model with drift is the same as the "non-zero-frequency" spectrum of the corresponding model without
drift. Since UCARIMA(p,d,q) and ARIMA(p,d,q) are observationally equivalent, they have the same
"non-zero-frequency" spectrum.
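
The "non-zero-frequency" spectrum discussed above can be evaluated numerically at any frequency
other than zero. The sketch below is a minimal illustration, not the author's code. It assumes the
sign conventions 1 - a_1 B - ... - a_p B^p for the autoregressive polynomial and
1 + b_1 B + ... + b_q B^q for the moving average polynomial; the function and argument names are
hypothetical.

```python
import numpy as np


def arima_spectrum(omega, ar=(), ma=(), d=0, sigma2=1.0):
    """Evaluate the spectrum formula for an ARIMA(p,d,q) process at nonzero frequencies.

    ar holds a_1..a_p of 1 - a_1*B - ... - a_p*B**p (assumed sign convention),
    ma holds b_1..b_q of 1 + b_1*B + ... + b_q*B**q (assumed sign convention).
    """
    omega = np.asarray(omega, dtype=float)
    z = np.exp(-1j * omega)
    ar_poly = 1.0 - sum(a * z ** (j + 1) for j, a in enumerate(ar))
    ma_poly = 1.0 + sum(b * z ** (j + 1) for j, b in enumerate(ma))
    arma_part = np.abs(ma_poly / ar_poly) ** 2 * sigma2 / (2.0 * np.pi)
    # |1 - z|**2 = 2*(1 - cos(omega)), so d differences divide by this factor d times.
    return arma_part / (2.0 * (1.0 - np.cos(omega))) ** d


# A random walk (d = 1, no ARMA terms): power grows without bound as omega approaches zero.
for omega in (np.pi, 1.0, 0.1, 0.001):
    print(f"omega = {omega:8.4f}  spectrum = {float(arima_spectrum(omega, d=1)):.4f}")
```

The function is deliberately not evaluated at zero frequency, where the expression is infinite for
any d greater than zero.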


REFERENCES 


Banerjee,  A.,  J.  Dolado,  D.F.  Hendry,  and  G.  Smith  (1986),  Exploring  equilibrium  relationship 
in  econometrics  through  statistic  models:  Some  Monte  Carlo  evidence.  Oxford  Bulletin 
of  Economics  and  Statistics.  Vol.  48,  pp.  253-277. 

Bils, Mark and J. Cho (1993), Cyclical Factor Utilization. Institute for Empirical
Macroeconomics. Discussion Paper 79, Federal Reserve Bank of Minneapolis.

Boswijk,  H.  Peter  (1989),  Estimation  and  testing  for  cointegration  with  trended  variables:  A 
comparison  of  a static  and  a dynamic  regression  procedure.  Working  Paper,  University 
of  Amsterdam. 

Cho, J. (1990), Money, Nominal Contracts, and the Business Cycle: One-period Contract Case.
Working Paper, University of Rochester.

Cho, J. and L. Phaneuf (1993), A Business Cycle Model with Nominal Wage Contracts and
Government. Discussion Paper 80, Institute for Empirical Macroeconomics, Federal
Reserve Bank of Minneapolis.

Cho,  J.  and  T.  F.  Cooley  (1990),  Employment  and  Hours  Over  The  Business  Cycle.  Working 
Paper,  University  of  Rochester. 

Choi, I. (1990), Most U.S. Economic Time Series Do Not Have Unit Roots: Nelson and
Plosser's Results Reconsidered. Manuscript, Ohio State University.

Christiano,  L.  J.  and  M.  Eichenbaum  (1992),  Current  Real-Business-Cycle  Theories  and 
Aggregate  Labor-Market  Fluctuations.  American  Economic  Review,  82,  430-450. 

Cooley,  T.  F.  and  G.  D.  Hansen  (1989),  The  Inflation  Tax  in  a Real  Business  Cycle  Model. 
American  Economic  Review.  79,  733-748. 

Efron, B. (1979), Bootstrap methods: Another look at the jackknife. The Annals of Statistics. 7,
1-26.

Efron, B. (1981), Censored data and bootstrap. Journal of the American Statistical
Association. 76, 312-319.

Efron, B. (1985), Bootstrap confidence intervals for a class of parametric problems. Biometrika.
72, 45-58.


Eichenbaum, M. and K. J. Singleton (1986), Do equilibrium real business cycle theories
explain postwar U.S. business cycles? NBER 91-135.

Engle, R. (1974), Band spectrum regression. International Economic Review, 15, 1-11.

Engle, R., and C. Granger (1987), Cointegration and error correction: Representation, estimation
and testing. Econometrica. 55, 251-276.

Evans, C. L. (1992), Productivity shocks and real business cycles. Journal of Monetary
Economics, 29, 191-208.

Gonzalo,  J.  (1994),  Five  alternative  methods  of  estimating  long-run  equilibrium  relationships. 
Journal  of  Econometrics.  60,  203-233. 

Granger, C. (1964), Spectral Analysis of Economic Time Series. Princeton University Press.

Hakkio, C. and M. Rush (1991), Cointegration: how short is the long run? Journal of
International Money and Finance. 10, 571-581.

Hall, P. (1988), Theoretical comparison of bootstrap intervals. The Annals of Statistics. 16,
927-953.

Hamilton, J. D. (1994), Time Series Analysis. Princeton University Press.

Hansen, B.E. and P.C.B. Phillips (1990), Estimation and inference in models of
cointegration: A simulation study. Advances in Econometrics. 8, 225-248.

Hansen, G. D. (1985), Indivisible labor and the business cycle. Journal of Monetary Economics.
16, 309-327.

Hansen,  G.  D.  and  T.  J.  Sargent  (1988),  Straight  time  and  overtime  in  equilibrium.  Journal  of 
Monetary  Economics.  21,  281-308. 

Harvey,  A.  (1981),  Time  Series  Models.  John  Wiley  & Sons,  New  York. 

Harvey, A., E. Ruiz and E. Sentana (1992), Unobserved component time series models with
ARCH disturbances. Journal of Econometrics. 52, 129-157.

Hodrick, R. and E. Prescott (1980), Post-war U.S. business cycles: An empirical
investigation. Working Paper, Carnegie-Mellon University, Pittsburgh, PA.

Jenkins, G. M. and D. G. Watts (1968), Spectral analysis and its applications. Holden-Day, San
Francisco, CA.

Jeong, J., and G.S. Maddala (1993), A Perspective on application of bootstrap methods in
econometrics. Handbook of Statistics. Vol. 11: Econometrics, North-Holland.

Johansen, S. (1988), Statistical analysis of cointegration vectors. Journal of Economic Dynamics
and Control. 12, 231-254.

Kalman,  R.  (1960),  A new  approach  to  linear  filtering  and  prediction  problems.  Journal  of  Basic 
Engineering,  Transactions  of  the  ASME  Series  D.  82,  35-45. 

Kim, I. and P. Loungani (1992), The Role of energy in business cycle model. Journal of Monetary
Economics, 29, 173-189.

King, R. G., C. I. Plosser, J. H. Stock and M. W. Watson (1991), Stochastic trends and economic
fluctuations. American Economic Review. 81, 819-840.

King, R., and S. Rebelo (1993), Low frequency filtering and real business cycles. Journal of
Economic Dynamics and Control. 17, 207-231.

King, R. G., C. I. Plosser, and S. Rebelo (1988a), Production, growth, and business cycles: I.
The basic neoclassical model. Journal of Monetary Economics. 21, 195-232.

King, R., C. I. Plosser and S. Rebelo (1988b), Production, growth, and business cycles: II. The
basic neoclassical model. Journal of Monetary Economics. 21, 309-341.

Kydland,  F.  E.  and  E.  Prescott  (1982),  Time  to  build  and  aggregate  fluctuations. 
Econometrica.  50,  1345-1371. 

Li, Y. (1994), Bootstrapping cointegrating regression. Economics Letters. 44, 229-233.

Li,  H.  and  G.S.  Maddala  (in  press).  Bootstrapping  time  series  models.  Econometric  Review. 

Li,  Y.,  G.S.  Maddala,  and  M.  Rush  (1995),  New  small  sample  estimators  for  cointegration 
regression:  Low-pass  spectral  filter  method.  Economics  Letters.  47,  123-129. 

Lippi, M. and L. Reichlin (1993), The dynamic effects of aggregate demand and supply
disturbances: Comment. American Economic Review. 83, 644-652.

Lucas, R. (1980), Two illustrations of the quantity theory of money. American Economic
Review. 70, 1345-1370.

Mankiw, George (1989), Real business cycles: A New Keynesian perspective. Journal of Economic
Perspectives. 3, 79-90.

Nelson, C. R., and C. I. Plosser (1982), Trend and random walks in macroeconomic time series.
Journal of Monetary Economics. 10, 139-162.

Park,  J.  (1992),  Canonical  cointegrating  regressions.  Econometrica.  60,  119-143. 


Park, J. and P. C. B. Phillips (1988), Statistical inference in regressions with integrated process:
Part 1. Econometric Theory. 4, 468-497.

Park, J. and P. Phillips (1989), Statistical inference in regressions with integrated process:
Part 2. Econometric Theory. 5, 95-131.

Phillips, P. (1991a), Optimal inference in cointegrated systems. Econometrica. 59, 283-306.

Phillips, P. (1991b), Spectral regression for cointegrated time series, in: W. Barnett, ed.,
Nonparametric and Semiparametric Methods in Economics and Statistics, Cambridge
University Press, Cambridge, pp. 413-436.

Phillips, P. and B. Hansen (1990), Statistical inference in instrumental variables regression
with I(1) Process. Review of Economic Studies. 57, 99-125.

Phillips, P. and S. Durlauf (1986), Multiple time series regression with integrated process.
Review of Economic Studies, 53, 473-496.

Plosser, C. (1989), Understanding real business cycle. Journal of Economic Perspectives. 3,
51-77.

Prescott, E. (1986), Theory ahead of business-cycle measurement. Carnegie-Rochester Conference
Series on Public Policy, 25, 11-44.

Priestley,  M.  (1988),  Non-linear  and  Non-stationary  Time  Series  Analysis,  Academic  Press,  New 
York,  NY. 


Priestley, M. B. (1981), Spectral Analysis and Time Series. Vol. 1 and 2, Academic Press, New
York, NY.

Rush, M. and S. Husted (1985), Purchasing power parity in the long run. Canadian Journal of
Economics. 18, 137-145.

Shea, G.S. (1989), Ex-post rational price approximations and the empirical reliability of the
present-value relation. Journal of Applied Econometrics. Vol. 4, 139-159.

Singleton, K. J. (1988), Econometric issues in the analysis of equilibrium business cycle models.
Journal of Monetary Economics. 21, 1988, 361-386.

Stock, J. and M. Watson (1988a), Testing for common trends. Journal of the American Statistical
Association. 83, 1097-1107.

Stock,  J.  H.  and  M.  W.  Watson  (1988b),  Variable  trends  in  economic  time  series.  Journal  of 
Economic  Perspectives.  2,  147-174. 

Stock,  J.  (1987),  Asymptotic  properties  of  least  squares  estimator  of  cointegrating  vectors. 
Econometrica.  55,  381-386. 

Vinod, H.D., and B.D. McCullough (1991), Bootstrapping cointegrating regression. Paper
presented at Statistics Canada Meetings, Montreal.

Watson, M. W. (1993), Measures of fit for calibrated models. Journal of Political Economy.
101, 1011-1041.

Wiener, N. (1949), Extrapolation, Interpolation and Smoothing of Stationary Time Series.
Wiley, New York.


BIOGRAPHICAL  SKETCH 


Yikang Li was born in Shanghai, China, on the 8th of June, 1953. In 1983, he received a
B.A. degree in economics from Shanghai University of Finance and Economics. After he
graduated in 1986 from Shanghai University of Finance and Economics with an M.A. in
economics, he took a position as Assistant Professor at the University, and worked as vice
chairman of the Department of Statistics at the same University from 1986 to 1989. In 1990, he
entered the doctoral program in economics at the University of Florida and anticipates completing
his doctoral studies in 1995.




I certify  that  I have  read  this  study  and  that  in  my  opinion  it  conforms  to  acceptable 
standards  of  scholarly  presentation  and  is  fully  adequate,  in  scope  and  quality,  as  a dissertation 
for  the  degree  of  Doctor  of  Philosophy. 

Mark  Rush,  Chairman 
Professor  of  Economics 

I certify  that  I have  read  this  study  and  that  in  my  opinion  it  conforms  to  acceptable 
standards  of  scholarly  presentation  and  is  fully  adequate,  in  scope  and  quality,  as  a dissertation 
for  the  degree  of  Doctor  of  Philosophy. 

William Bomberger
Associate Professor of Economics

I certify  that  I have  read  this  study  and  that  in  my  opinion  it  conforms  to  acceptable 
standards  of  scholarly  presentation  and  is  fully  adequate,  in  scope  and  quality,  as  a dissertation 
for  the  degree  of  Doctor  of  Philosophy. 


David Denslow
Distinguished Service Professor of Economics


I certify  that  I have  read  this  study  and  that  in  my  opinion  it  conforms  to  acceptable 
standards  of  scholarly  presentation  and  is  fully  adequate,  in  scope  and  quality,  as  a dissertation 
for  the  degree  of  Doctor  of  Philosophy. 

Lawrence Kenny
Professor of Economics


I certify  that  I have  read  this  study  and  that  in  my  opinion  it  conforms  to  acceptable 
standards  of  scholarly  presentation  and  is  fully  adequate,  in  scope  and  quality,  as  a dissertation 
for  the  degree  of  Doctor  of  Philosophy. 


Mark Flannery
Barnett Banks Eminent Scholar of Finance


This  dissertation  was  submitted  to  the  Graduate  Faculty  of  the  Department  of  Economics 
in  the  College  of  Business  Administration  and  to  the  Graduate  School  and  was  accepted  as  partial 
fulfillment  of  the  requirements  for  the  degree  of  Doctor  of  Philosophy. 

August,  1995 


Dean,  Graduate  School 


